---
title: "<br><br>How valid are trust survey measures?<br> New insights from open-ended probing data and supervised machine learning<br><br>"
shorttitle: "How valid are trust survey measures?"
author:
  - name: <!-- Camille Landesvatter -->
    affiliation: <!-- Mannheim Centre for European Social Research -->
    email: <!-- <camille.landesvatter@uni-mannheim.de> -->
  - name: <!-- Paul C. Bauer -->
    affiliation: <!-- Mannheim Centre for European Social Research -->
    email: <!-- <mail@paulcbauer.de> -->
date: '<!-- Date: `r format(Sys.time(), "%d %B, %Y")`-->'
abstract: "Trust is a foundational concept of contemporary sociological theory. Still, empirical research on trust relies on a relatively small set of measures. These are increasingly debated, potentially undermining large swathes of empirical evidence. Drawing on a combination of open-ended probing data, supervised machine learning, and a U.S. representative quota sample, our study compares the validity of standard measures of generalized social trust with more recent, situation-specific measures of trust. We find that survey measures that refer to 'strangers' in their question wording best reflect the concept of generalized trust, also known as trust in unknown others. While situation-specific measures should have the desirable property of further reducing variation in associations, i.e., producing more similar frames of reference across respondents, they also seem to increase associations with known others, which is undesirable. In addition, we explore to what extent trust survey questions may evoke negative associations. We find that there is indeed variation across measures, which calls for more research."
keywords: [social trust, generalized trust, survey experiment, open-ended survey questions, text analysis, sentiment analysis, BERT]
#links-to-footnotes: true
paged-footnotes: true
output: 
  pagedown::jss_paged:
    template: wp_paged.html
    self_contained: true
    css: ['wp.css', 'wp-fonts.css', 'wp-page.css']
    csl: american-sociological-association.csl
bibliography: references.bib
knit: pagedown::chrome_print
---

```{r setup, include=FALSE}
## Global chunk options
knitr::opts_chunk$set(
  cache = FALSE,
  echo = FALSE,
  message = FALSE,
  warning = FALSE,
  fig.height = 3.5,
  fig.width = 8,
  fig.align = "center",
  dpi = 200
)
options(scipen = 999)
options(digits = 2)
options(htmltools.dir.version = FALSE)

# Set global thousand separator
knitr::knit_hooks$set(inline = function(x) {
  prettyNum(x, big.mark=",")
})
```

```{r packages}
# library()'s second positional argument is `help`, so `library(pacman, conflicted)` only ever attached pacman
library(pacman)
pacman::p_load(
  tidyverse,
  magrittr,
  tm,
  grid,
  gridExtra,
  stringr,
  kableExtra,
  modelsummary,
  caTools,
  randomForest,
  magick,
  flextable,
  gt,
  pdftools,
  sjstats,
  ROSE,
  rstatix
)
```

<!-- Note: The following code chunks document the data preprocessing steps undertaken before our actual analysis. For the sake of transparency, the complete code is retained; however, the publicly accessible files (doi:10.7910/DVN/FJXH5G) do not include everything needed to reproduce these steps. To execute this .Rmd file, start at the `import data` chunk directly below the "START" marker, where the fully preprocessed dataset is loaded. Should any files or objects used before that point be of interest, kindly request them, and we will gladly provide the requisite information. -->


```{r import-data-preprocessed, eval=F}
# import data with variables that we already preprocessed and anonymized
# e.g. break-offs are removed
# experimental conditions were summarized
# socio-demographic variables were manipulated (age, income, etc.)

data <- read_csv("../data/data_preprocessed.csv")

# important! note to authors: in case of requests only provide anonymized data (i.e., remove all unused sociodemographic columns)
```

```{r standardize-trust, eval=F}
# Standardize trust (min/max scaling)
data <- data %>% 
  mutate(across(c("trust_most_people", "trust_first_time", "trust_stranger",
                  "trust_secret", "trust_loan", "trust_child", "trust_advice"),
                .fns = list(std = ~(. - min(., na.rm = T))/(max(., na.rm = T) - min(., na.rm = T)))))
```
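
<!-- Sketch (not part of the original pipeline): a toy illustration of the min/max scaling applied in the chunk above, written in base R. The helper `minmax` and the vector `toy_scores` are hypothetical names introduced for illustration only; the actual trust items and their scale range are those in the preprocessed data. -->

```{r sketch-minmax-scaling, eval=FALSE}
# Hypothetical helper: rescale a numeric vector to the interval [0, 1]
minmax <- function(x) {
  rng <- range(x, na.rm = TRUE)      # observed minimum and maximum
  (x - rng[1]) / (rng[2] - rng[1])   # missing values stay missing
}

# Toy scores (assumed values, not taken from the survey)
toy_scores <- c(0, 3, 6, NA)
toy_scaled <- minmax(toy_scores)
toy_scaled  # 0.0 0.5 1.0 NA
```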

```{r create-data-long-generalized, eval=F}
# Reshape data into long format

data_long_generalized <- data %>%
  select(
    ID_participant,
    group,
    ran_stage1_first_question,
    trust_most_people_std,
    trust_first_time_std,
    trust_stranger_std,
    sex,
    age_cat,
    ethnicity,
    socioeconomic_status,
    income,
    education
  ) %>%
  pivot_longer(
    cols = -c(
      ID_participant,
      group,
      ran_stage1_first_question,
      sex,
      age_cat,
      ethnicity,
      socioeconomic_status,
      income,
      education
    ),
    names_to = "variable",
    values_to = "value"
  ) %>%
  mutate(variable = gsub("_std", "", variable)) %>%
  mutate(variable = factor(
    variable,
    levels = c("trust_most_people",
               "trust_first_time",
               "trust_stranger")
  )) %>%
  mutate(only_block1 = case_when(group == 1 ~ TRUE,
                                 group != 1 ~ FALSE)) %>% mutate(
                                   ran_stage1_first_question_label =
                                     recode(
                                       ran_stage1_first_question,
                                       "1" = "trust_most_people",
                                       "2" = "trust_first_time",
                                       "3" = "trust_stranger"
                                     )
                                 ) %>%
  mutate(
    only_first_gentrust_question = case_when(
      variable == ran_stage1_first_question_label ~ TRUE,
      variable != ran_stage1_first_question_label ~ FALSE
    )
  )

# Probing Data
data_long_probes_generalized <- data %>%
  select(
    ID_participant,
    group,
    ran_stage1_first_question,
    trust_most_people_probe,
    trust_first_time_probe,
    trust_stranger_probe
  ) %>%
  pivot_longer(
    cols = -c(ID_participant,
              group,
              ran_stage1_first_question),
    names_to = "variable",
    values_to = "value"
  ) %>%
  mutate(variable = gsub("_probe", "", variable)) %>%
  rename(probing_answer = value)

# Merge the two long-format datasets
data_long_generalized <- left_join(
  data_long_generalized,
  data_long_probes_generalized %>% select(-group,-ran_stage1_first_question),
  by = c("ID_participant", "variable")
) %>%
  arrange(group, ID_participant, ran_stage1_first_question)

# construct a unique identifier for the long format (participant ID * question wording)
data_long_generalized <- data_long_generalized %>%
  mutate(ID_participant_long = paste(ID_participant, variable, sep = "_"))
```

```{r create-data-long-situative, eval=F}
# Data
data_long_situative <- data %>%
  select(
    ID_participant,
    group,
    ran_stage2_first_question_A,
    ran_stage2_person_stranger,
    trust_secret_std,
    trust_loan_std,
    trust_child_std,
    trust_advice_std,
    sex,
    age_cat,
    ethnicity,
    socioeconomic_status,
    income,
    education
  ) %>%
  pivot_longer(
    cols = -c(
      ID_participant,
      group,
      ran_stage2_first_question_A,
      ran_stage2_person_stranger,
      sex,
      age_cat,
      ethnicity,
      socioeconomic_status,
      income,
      education
    ),
    names_to = "variable",
    values_to = "value"
  ) %>%
  mutate(variable = gsub("_std", "", variable)) %>%
  mutate(variable = factor(
    variable,
    levels = c("trust_secret",
               "trust_loan",
               "trust_child",
               "trust_advice")
  )) %>%
  mutate(only_block2 = case_when(group == 2 ~ TRUE,
                                 group != 2 ~ FALSE)) %>% mutate(
                                   ran_stage2_first_question_label =
                                     recode(
                                       ran_stage2_first_question_A,
                                       "1" = "trust_secret",
                                       "2" = "trust_loan",
                                       "3" = "trust_child",
                                       "4" = "trust_advice"
                                     )
                                 ) %>%
  mutate(
    only_first_sittrust_question = case_when(
      variable == ran_stage2_first_question_label ~ TRUE,
      variable != ran_stage2_first_question_label ~ FALSE
    )
  )

# Probing data
data_long_probes_situative <- data %>%
  select(
    ID_participant,
    group,
    ran_stage2_first_question_A,
    trust_secret_probe_overall,
    trust_loan_probe_overall,
    trust_child_probe_overall,
    trust_advice_probe_overall
  ) %>%
  pivot_longer(
    cols = -c(ID_participant,
              group,
              ran_stage2_first_question_A),
    names_to = "variable",
    values_to = "value"
  ) %>%
  mutate(variable = gsub("_probe_overall", "", variable)) %>%
  rename(probing_answer = value)

# Merge trust scores with probes
data_long_situative <- left_join(
  data_long_situative,
  data_long_probes_situative %>% select(-group,-ran_stage2_first_question_A),
  by = c("ID_participant", "variable")
) %>%
  arrange(group, ID_participant, ran_stage2_first_question_A)


# construct a unique identifier for the long format (participant ID * question wording)
data_long_situative <- data_long_situative %>%
  mutate(ID_participant_long = paste(ID_participant, variable, sep = "_"))
```

```{r combine-generalized-situative, eval=F}
data <- dplyr::bind_rows(data_long_generalized, data_long_situative) #N=10,500
```

<!-- below: code for creating data that contains our manually labeled observations -->

```{r import-labeled-data, eval=F}
data_known_unknown_manual_codes <-
  read_csv("../data/data_known_unknown_manual_codes.csv")

data_sentiment_manual_codes <-
  read_csv("../data/data_sentiment_manual_codes.csv") %>%
  mutate(manual_code_sentiment_dichotomous=recode(manual_code_sentiment, "-1"="1","0"="0", "1"="0","8"="0", "9"="9")) #create dichotomous sentiment variable
```

```{r add-manual-codes-to-data, eval=F}
data <- dplyr::left_join(data, data_known_unknown_manual_codes[, c("ID_participant_long", "manual_code_known_unknown", "group")], by="ID_participant_long")

data <- dplyr::left_join(data, data_sentiment_manual_codes[, c("ID_participant_long", "manual_code_sentiment")], by="ID_participant_long")

data <- dplyr::left_join(data, data_sentiment_manual_codes[, c("ID_participant_long", "manual_code_sentiment_dichotomous")], by="ID_participant_long")

# Recode and turn to factor for Random Forest function later
data <- data %>% 
  mutate(manual_code_known_unknown = as.factor(as.numeric(recode(manual_code_known_unknown,
                                            "9"="1","8"="0", "1"="1","0"="0")))) %>% 
  mutate(manual_code_sentiment = as.factor(as.numeric(recode(manual_code_sentiment,
                                        "-1"="-1","0"="0", "1"="1","8"="0", "9"= "NA")))) %>%
  mutate(manual_code_sentiment_dichotomous = as.factor(as.numeric(recode(manual_code_sentiment_dichotomous,
                                        "1"="1","0"="0","9"="NA"))))
```

```{r bert-generate-data, eval=FALSE}
# above we created the fine-tuning data for the BERT classifier
# export the above data for BERT

#write_csv(data, "../data/data_for_bert.csv")
```


<!-- below: code for creating test and training data for BERT finetuning -->

```{r train-test-split, eval=F}
data_known_unknown_unbalanced <- data %>% 
  # only use data where probing answers and manual codes are available
  filter(!is.na(probing_answer) & !is.na(manual_code_known_unknown)) %>% 
  # select relevant variables only since ovun.sample can only work with data that has no columns with NAs
  select(ID_participant_long,
         manual_code_known_unknown,
         probing_answer)

data_sentiment_unbalanced <- data %>% 
  filter(!is.na(probing_answer) & !is.na(manual_code_sentiment_dichotomous)) %>% 
  select(ID_participant_long,
         manual_code_sentiment_dichotomous,
         probing_answer)

# split data into train and test

# known-unknown
set.seed(1234)
split_known_unknown = sample.split(data_known_unknown_unbalanced$manual_code_known_unknown, SplitRatio = .7)
train_known_unknown = subset(data_known_unknown_unbalanced, split_known_unknown == TRUE)
test_known_unknown  = subset(data_known_unknown_unbalanced, split_known_unknown == FALSE)

# sentiment
set.seed(1234)
split_sentiment = sample.split(data_sentiment_unbalanced$manual_code_sentiment_dichotomous, SplitRatio = .7)
train_sentiment = subset(data_sentiment_unbalanced, split_sentiment == TRUE)
test_sentiment  = subset(data_sentiment_unbalanced, split_sentiment == FALSE)
```
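
<!-- Sketch (not part of the original pipeline): `sample.split()` from caTools draws the split within each label, so class proportions are preserved in the training and test sets. The base R code below illustrates that idea on assumed toy labels; `toy_labels` and `in_train` are hypothetical names, and the loop is a simplified stand-in rather than the caTools implementation. -->

```{r sketch-stratified-split, eval=FALSE}
set.seed(1234)

# Toy labels with an 80/20 class imbalance (assumed, not the survey data)
toy_labels <- rep(c(0, 1), times = c(80, 20))

# Draw 70% of the cases within each label separately
in_train <- logical(length(toy_labels))
for (lab in unique(toy_labels)) {
  idx <- which(toy_labels == lab)
  in_train[sample(idx, size = round(0.7 * length(idx)))] <- TRUE
}

# Both classes contribute 70% of their cases to the training set
table(label = toy_labels, train = in_train)
```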

```{r eval=FALSE, include=FALSE}
# export training (and test) data

# write_csv(train_known_unknown, "../data/training-and-test-data/train_known_unknown.csv")
#
# write_csv(test_known_unknown, "../data/training-and-test-data/test_known_unknown.csv")
# 
# write_csv(train_sentiment, "../data/training-and-test-data/train_sentiment.csv")
#
# write_csv(test_sentiment, "../data/training-and-test-data/test_sentiment.csv")
```




<!-- below: code for supervised categorization with Random Forest -->

```{r rf-corpus, eval=F}
# cast text data into corpus
corpus <- Corpus(VectorSource(data$probing_answer[!is.na(data$probing_answer)])) %>% # remove those without probing answer
  tm_map(removePunctuation, preserve_intra_word_dashes = TRUE) %>% 
  tm_map(removeNumbers) %>% 
  tm_map(content_transformer(tolower)) %>% 
  tm_map(removeWords, stopwords("english")) %>% 
  tm_map(stripWhitespace) %>% 
  tm_map(stemDocument)
```

```{r rf-document-term-matrix, eval=F}
# Create DTM and add targets for random forest algorithm
dtm <- DocumentTermMatrix(corpus) %>% 
  removeSparseTerms(., 0.995) %>% 
  # convert to matrix
  as.matrix() %>% 
  # convert to data frame
  as.data.frame() %>%
  # add "true" target variable (i.e., code) to the data
  mutate(manual_code_known_unknown=data$manual_code_known_unknown[!is.na(data$probing_answer)]) %>% 
  mutate(manual_code_sentiment=data$manual_code_sentiment[!is.na(data$probing_answer)]) %>%
  mutate(manual_code_sentiment_dichotomous=data$manual_code_sentiment_dichotomous[!is.na(data$probing_answer)])

# Add names to matrix
colnames(dtm) = make.names(colnames(dtm))

# add "long format identifier" to the data 
dtm$ID_participant_long = data$ID_participant_long[!is.na(data$probing_answer)]

# Create DTMs that only contain responses that were manually coded
dtm_labeled_known_unknown <- dtm %>% select(-c(manual_code_sentiment, manual_code_sentiment_dichotomous)) %>% filter(!is.na(manual_code_known_unknown)) #dichotomous classifier

dtm_labeled_sentiment <- dtm %>% select(-c(manual_code_known_unknown, manual_code_sentiment_dichotomous)) %>% filter(!is.na(manual_code_sentiment)) 

dtm_labeled_sentiment_dichotomous <- dtm %>% select(-c(manual_code_known_unknown, manual_code_sentiment)) %>% filter(!is.na(manual_code_sentiment_dichotomous)) 
```

```{r random-forest-classifier, eval=FALSE}
# train classifier based on all manually labeled datapoints
# random forests apply bootstrapping to fit a large number of decision trees
# sampling with replacement (i.e., bootstrapping) ensures that approximately one-third of the data points are out-of-bag (OOB) data
# this OOB data serves as a built-in validation set, eliminating the need for additional splitting of the data into test and training sets

# set.seed(100)
# classifier_known_unknown = randomForest(manual_code_known_unknown ~ ., data=dtm_labeled_known_unknown[,!names(dtm_labeled_known_unknown) %in% "ID_participant_long"], ntree=500)
# 
# set.seed(100)
# classifier_sentiment_dichotomous = randomForest(manual_code_sentiment_dichotomous ~ ., data=dtm_labeled_sentiment_dichotomous[,!names(dtm_labeled_sentiment_dichotomous) %in% "ID_participant_long"], ntree=500) 

# saveRDS(classifier_known_unknown, "../random-forest-models/classifier_known_unknown.rds")
# saveRDS(classifier_sentiment_dichotomous, "../random-forest-models/classifier_sentiment_dichotomous.rds")
```
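
<!-- Sketch (not part of the original pipeline): the comments in the chunk above state that roughly one third of the observations are out-of-bag (OOB) for each tree. The base R simulation below illustrates why: when n cases are drawn with replacement, the chance that a given case is never drawn is (1 - 1/n)^n, which approaches exp(-1). The sample size and number of replications are assumed values chosen for illustration. -->

```{r sketch-oob-share, eval=FALSE}
set.seed(100)

n_obs <- 1000   # assumed number of labeled responses
n_trees <- 500  # one bootstrap sample per tree

# Share of observations left out of each bootstrap sample
oob_share <- replicate(n_trees, {
  in_bag <- sample.int(n_obs, size = n_obs, replace = TRUE)
  1 - length(unique(in_bag)) / n_obs
})

mean(oob_share)         # simulated average OOB share
(1 - 1 / n_obs)^n_obs   # theoretical value, close to exp(-1)
```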


```{r rf-import-classifier-unbalanced, eval=T}
classifier_known_unknown <- readRDS("../random-forest-models/classifier_known_unknown.rds")
classifier_sentiment_dichotomous <- readRDS("../random-forest-models/classifier_sentiment_dichotomous.rds")
```

```{r rf-evaluation, eval=T}
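# Use the out-of-bag (OOB) samples to estimate the error / accuracy rate
# For each case, the OOB prediction is the majority vote of the trees that did not see that case; the share of cases whose OOB prediction differs from the true class is the OOB error estimate. This has proven to be unbiased in many tests.
# Note: err.rate[, 1] holds the cumulative OOB error after 1, 2, ..., ntree trees, so the lines below average over all forest sizes; the OOB error of the final forest alone is the last row of err.rate.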
accuracy_content <- 1-mean(classifier_known_unknown$err.rate[,1]) 
accuracy_sentiment_dichotomous <- 1-mean(classifier_sentiment_dichotomous$err.rate[,1])
# confusion matrix of the prediction (based on OOB data) with OOB error rate
conf.matrix.content<-classifier_known_unknown$confusion
conf.matrix.sentiment<-classifier_sentiment_dichotomous$confusion
# rows: actual (expected), columns: predicted
```

```{r rf-evaluation-f1-etc, eval=T}
# Precision: tp/(tp+fp):
precision.content<-conf.matrix.content[2,2]/(conf.matrix.content[2,2]+conf.matrix.content[1,2])
precision.sentiment<-conf.matrix.sentiment[2,2]/(conf.matrix.sentiment[2,2]+conf.matrix.sentiment[1,2])
# Recall: tp/(tp + fn):
recall.content<-conf.matrix.content[2,2]/(conf.matrix.content[2,2]+conf.matrix.content[2,1])
recall.sentiment<-conf.matrix.sentiment[2,2]/(conf.matrix.sentiment[2,2]+conf.matrix.sentiment[2,1])
# F-Score: 2 * precision * recall /(precision + recall):
fscore.content <- 2*precision.content*recall.content / (precision.content+recall.content)
fscore.sentiment <- 2*precision.sentiment*recall.sentiment / (precision.sentiment+recall.sentiment)
```
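
<!-- Sketch (not part of the original pipeline): a worked toy example of the precision, recall, and F-score formulas used in the chunk above, with rows as actual classes and columns as predicted classes. The counts in `toy_conf` are invented for illustration and are not results from our classifiers. -->

```{r sketch-precision-recall, eval=FALSE}
# Toy confusion matrix (assumed counts): rows = actual, columns = predicted
toy_conf <- matrix(c(50, 10,
                      5, 35),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(actual = c("0", "1"),
                                   predicted = c("0", "1")))

tp <- toy_conf["1", "1"]  # true positives
fp <- toy_conf["0", "1"]  # false positives
fn <- toy_conf["1", "0"]  # false negatives

precision <- tp / (tp + fp)
recall <- tp / (tp + fn)
f1 <- 2 * precision * recall / (precision + recall)

round(c(precision = precision, recall = recall, f1 = f1), 3)
```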

```{r rf-predictions, eval=F}
# Add predictions for all observations in dtm
dtm_known_unknown <- dtm %>% 
  mutate(rf_prediction_known_unknown= as.numeric(as.character(predict(classifier_known_unknown, .)))) %>%
  select(ID_participant_long, rf_prediction_known_unknown)

dtm_sentiment_dichotomous <- dtm %>% 
  mutate(rf_prediction_sentiment_dichotomous=as.numeric(as.character(predict(classifier_sentiment_dichotomous, .))))  %>%
  select(ID_participant_long, rf_prediction_sentiment_dichotomous)

# Merge predictions
data <- left_join(data, dtm_known_unknown, by="ID_participant_long")
data <- left_join(data, dtm_sentiment_dichotomous, by="ID_participant_long")

# Add variable that indicates whether the code was hand-labeled or assigned via ML classifier
data <- data %>% 
  mutate(coding_procedure_known_unknown=factor(ifelse(is.na(manual_code_known_unknown),"ML","manual"))) %>%
  mutate(coding_procedure_sentiment_dichotomous=factor(ifelse(is.na(manual_code_sentiment),"ML","manual")))

# Convert manual codes back to numeric
data <- data %>% 
  mutate(manual_code_known_unknown = as.numeric(as.character(manual_code_known_unknown)),
         manual_code_sentiment_dichotomous = as.numeric(as.character(manual_code_sentiment_dichotomous)))

# final random forest code variable
data <- data %>%  
   mutate(code_known_unknown_with_rf=ifelse(coding_procedure_known_unknown=="ML",rf_prediction_known_unknown, manual_code_known_unknown)) %>% 
   mutate(code_sentiment_dichotomous_with_rf=ifelse(coding_procedure_sentiment_dichotomous=="ML",rf_prediction_sentiment_dichotomous, manual_code_sentiment_dichotomous))
```
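
<!-- Sketch (not part of the original pipeline): a toy illustration of how the final code variable is assembled in the chunk above. Manual codes are kept wherever they exist, and classifier predictions fill in only the responses without a manual code. All `toy_` vectors are hypothetical values introduced for illustration. -->

```{r sketch-combine-codes, eval=FALSE}
# Toy data (assumed values): NA means the response was not hand-labeled
toy_manual <- c(1, NA, 0, NA)
toy_predicted <- c(0, 1, 0, 0)

# Flag how each response receives its code
toy_procedure <- ifelse(is.na(toy_manual), "ML", "manual")

# Manual codes take precedence; predictions are used only for unlabeled responses
toy_final <- ifelse(toy_procedure == "ML", toy_predicted, toy_manual)
toy_final  # 1 1 0 0
```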

<!-- below: code for importing and recoding results from BERT classifier. The actual code with which we fine-tuned the BERT model can be found in "bert_classification_replication.ipynb" -->

```{r import-BERT-data, eval=F}
data_bert <- read_csv("../data/data_bert_predictions.csv") #7492
```

```{r add-BERT-to-data, eval=F}
# add BERT predictions to data using ID_participant_long
data <- left_join(data, data_bert[,c("ID_participant_long", "bert_prediction_known_unknown", "bert_prediction_sentiment")], by="ID_participant_long")
```

```{r BERT-coding, eval=F}
# final (BERT) code variable

data <- data %>%
   mutate(code_known_unknown=ifelse(coding_procedure_known_unknown=="ML",bert_prediction_known_unknown, manual_code_known_unknown)) %>%
   mutate(code_sentiment_dichotomous=ifelse(coding_procedure_sentiment_dichotomous=="ML",bert_prediction_sentiment, manual_code_sentiment_dichotomous))


data$code_known_unknown <- factor(as.character(as.numeric(data$code_known_unknown)),
                                  labels=c("No", "Yes"))
data$code_sentiment_dichotomous<-factor(as.character(as.numeric(data$code_sentiment_dichotomous)),labels=c("neutral/positive", "negative"))

data$code_known_unknown_with_rf <- factor(as.character(data$code_known_unknown_with_rf),
                                  labels=c("No", "Yes"))
data$code_sentiment_dichotomous_with_rf<-factor(as.character(data$code_sentiment_dichotomous_with_rf),labels=c("neutral/positive", "negative"))
```


```{r exclude-sensitive-ids, eval=F}
# additionally, here, we remove open answers that contained sensitive information. The original, published study includes these data points, however for the purpose of this replication material, we had to exclude these from visibility.

sensitive_ids <- read.csv("../data/sensitive_ids.csv") #import from project folder
ids <- sensitive_ids$ID_participant_long
data <- data %>% mutate(sensitive = ifelse(ID_participant_long %in% ids, 1, 0))

data <- data %>% 
  mutate(probing_answer = ifelse(sensitive == 1, "", probing_answer))
```



```{r eval=F}
# store anonymized data alongside sys time in file name
# t <- gsub(":", "-", gsub(" ", "_", Sys.time()))
# write_csv(data, paste0("data", t, ".csv"))
```



<!-- START  -->
```{r import data, eval=T}
data <- read.csv("../data/data2024-01-12_09-18-06-anonymized.csv")
```

```{r data-preprocessing, eval=T}
data <- data %>% 
  mutate(variable = recode(variable,
                             "trust_most_people" = "Most people",
                             "trust_first_time" = "People first time",
                             "trust_stranger" = "Stranger",
                             "trust_advice" = "Money advice",
                             "trust_child" = "Watching a loved one",
                             "trust_secret" = "Keeping a secret",
                             "trust_loan" = "Repaying a loan")) %>% 
  mutate(variable = factor(variable, levels= c("Most people",
                                               "People first time",
                                               "Stranger",
                                               "Money advice",
                                               "Watching a loved one",
                                               "Keeping a secret",
                                               "Repaying a loan"))) %>% 
  mutate(stage = ifelse((variable=="Most people" | 
                           variable=="People first time" |
                           variable=="Stranger"),1,2)) %>%
  mutate(first_question=ifelse(is.na(only_first_gentrust_question),only_first_sittrust_question,only_first_gentrust_question)) %>% 
  mutate(subset=ifelse(((stage==1 | stage ==2) & first_question==TRUE), 
                       "Answers to first randomly assigned (probing) question only",
                       "Answers to all (probing) questions")) 


data$code_known_unknown <- factor((data$code_known_unknown),
                                  levels=c("No", "Yes"))

data$code_sentiment_dichotomous <- factor((data$code_sentiment_dichotomous),
                                          levels=c("neutral/positive", 
                                                   "negative"))


data$code_known_unknown_with_rf <- factor((data$code_known_unknown_with_rf),
                                  levels=c("No", "Yes"))

data$code_sentiment_dichotomous_with_rf<-factor((data$code_sentiment_dichotomous_with_rf),levels=c("neutral/positive", "negative"))
```

```{r recode-sociodemographics, eval=T}
# change variable type from character to factor
data <- data %>% 
  mutate(across(all_of(c("age_cat", "sex", "ethnicity", "education", "socioeconomic_status")), factor))

# renaming of levels
levels(data$education)[c(1,2,3,4,5,6,7)] <- c('Phd','Graduate', 'High school', 'None', "Secondary", "Technical", "Undergrad")

# change order of levels
data$education <- factor(data$education,levels(data$education)[c(4,3,5,6,7,2,1)])

# turn some variables to numerical
data <- data %>% 
  mutate(education_num = as.numeric(education)) %>% 
  mutate(income_num = as.numeric(income)) %>% 
  mutate(socioeconomic_status_num = as.numeric(socioeconomic_status))
```


\newpage


# Introduction

Generalized social trust is one of the fundamental concepts in contemporary social theory [@Coleman1994-xe; @Herreros2004-yk; @Putnam1994-er; @Smith2010-uc; @Sztompka1999-ho; @Uslaner2002-md; @Schilke2021-sh] and scholarly interest in this concept has grown alongside the increasing number of studies on social capital and social cohesion, as trust is considered a main indicator of these concepts [@Portes2011-du; @Van_Deth2003-fs; @Larsen2013-of]. Consequently, empirical research investigating the causes and consequences of trust has multiplied [@Buskens2000-fv; @Cook2003-ys; @Dinesen2012-nb; @Dinesen2015-kp; @Dinesen2013-xa; @Sonderskov2011-uj]. At the same time, the underlying empirical research program relies on a relatively small set of established survey measures, some of which date back to the 1940s. In recent years, we have seen a growing debate about the validity of these measures, particularly regarding their ability to capture the same concept across all individuals [@Delhey2011-po; @Sturgis2010-sa; @Ermisch2009-qf; @Nannestad2008-fm; @Torpe2011-rb; @Robbins2019-nr; @Delhey2005-sc; @Bauer2018-ex].   
Our study aims to address this debate by investigating the validity of survey measures of generalized social trust. In doing so, we make several contributions to current research. First, we evaluate three classic trust measures in a U.S. sample, thus extending previous work that examined fewer measures using data from the UK [@Sturgis2010-sa; @Sturgis2019-jr]. All three measures have been used to measure generalized social trust, specifically trust in unknown others [@Sonderskov2011-uj; @Uslaner2002-md]. The first measure is known as the "most people question" [@Rosenberg1956-yo], which asks "Generally speaking, would you say that most people can be trusted, or that you can’t be too careful in dealing with people?". The second measure, referred to as the "people first time question" [e.g., @Torpe2011-rb], asks respondents about their level of trust in people they meet for the first time. Both measures are well established and have been used in numerous large-scale surveys. In contrast, what we call the "stranger question" [@Robbins2019-nr; @Robbins2021-yx], which reads "Imagine meeting a total stranger for the first time. Please identify how much you would trust this stranger.", is a more recent and promising alternative, expected to alleviate some of the problems that appear to characterize the former two. Our study examines the validity of these three measures, that is, whether they genuinely measure trust in unknown others, and thereby identifies possible measurement errors that might influence estimates of trust levels. To achieve this, we designed a survey experiment in which the different measures were randomly assigned to respondents. Our main findings are derived from open-ended questions that ask about the frames of reference, which we call associations, underlying respondents' answers.   
Second, we contrast classic measures of generalized social trust with situative measures of trust. Such measures differ from the classic ones in that they specify a more refined trustee category (e.g., "most people" is replaced with "stranger") as well as a behavior toward which the expectation is directed (e.g., "keeping a secret"). Ideally, such measures provide a higher degree of interpersonal comparability because they leave less room for different interpretations by survey respondents. We are the first to probe such measures and provide evidence on whether validity and comparability increase when these measures are used.   
Third, we explore the sentiment of associations, a dimension that has been neglected so far in trust research. Theory assumes that trust in known others is higher due to effects of in-group bias and reciprocity [@Vollan2011-oc], which is supported by empirical evidence [e.g., @Bauer2018-ex; @Sturgis2010-sa]. However, independently of whether respondents refer to known or unknown others, associations may also vary in terms of their sentiment, for example whether they are positive or negative.   
Fourth, we extend the methodological toolbox that is used to evaluate the validity of survey measures, using a combination of open-ended probing questions [e.g., @Behr2012-oh; @Behr2017-xu; @Meitinger2022-cp; @Neuert2021-cc] and automated text analysis [e.g., @Schonlau2016-oq]. The data we labeled and the resulting supervised classifiers we built are suitable for future applications.



# Theory, hypotheses, and previous research   

## Associations with known and unknown others    

Generalized social trust is often referred to as trust in the generalized other and can be described as trust in individuals who are unfamiliar or unknown [@Sonderskov2011-uj; @Uslaner2002-md, 52; @Stolle2015-zv; @Sturgis2010-sa]. @Stolle2015-zv, for example, emphasizes the need to distinguish the scope of generalized trust from trust toward people one personally knows [@Stolle2015-zv, 398]. Notably, other accounts expand the concept of generalized or social trust to encompass a wider range of trustees, such as trust "in people in general" [@Yamagishi1994-rk, 146] or trust in the “average person [one] meets” [@Coleman1994-xe, 104]. Our study, however, adopts the understanding of generalized trust that stresses the difference between generalized and particularized trust. Particularized trust is defined as "[...] trust found in close social proximity and extended toward people the individual knows from everyday interactions" [@Freitag2009-kd, 784], including family members, friends, neighbors, and co-workers [@Freitag2009-kd, 784] (i.e., known others), whereas generalized trust encompasses "[...] those beyond immediate familiarity, including strangers" [@Freitag2009-kd, 784] (i.e., unknown others). We therefore argue that generalized trust, conceptualized in this way, should ideally be measured as trust toward unknown others.    
Currently, the measurement of trust primarily relies on survey questions, although behavioral measures and their combination with survey measures have gained popularity [@Ermisch2010-ic; @Barr2003-rs; @Ermisch2009-qf; @Naef2009-dn; @Fehr2002-vj]. Different large-scale surveys use different questions. Undoubtedly, the standard measure is the so-called "most people question", which asks whether most people can be trusted. Versions of this question have been used in thousands of influential studies and in the underlying surveys, such as the General Social Survey, the World Values Survey, and the European Social Survey.    
However, the measurement of trust using the most people question has been the subject of much debate [cf. @Bauer2018-ex] regarding aspects such as scale length or balance [@Lundmark2016-qy] and the frames of reference respondents employ when answering it [@Nannestad2008-fm; @Sturgis2010-sa; @Delhey2014-et]. These frames of reference, which we call associations, are important because they are linked to the conceptual validity of a measure. Conceptual validity increases when the respective survey questions capture generalized trust without specification or measurement error. Figure \@ref(fig:fig-theory) depicts our main argument regarding these associations.   

```{r fig-theory, out.width="80%", fig.align='center',  fig.cap="Variation in associations and trust measurement values", fig.width = 8, fig.height = 6}
magick <- magick::image_read_pdf("../figures/figure-1.pdf",
                       pages = 1)
magick <- magick::image_trim(magick)
magick
```

When employing trustee categories such as "most people" in standard trust measures,  it is probable that distinct associations may arise among different respondents. For instance, in the illustrated example presented in Figure \@ref(fig:fig-theory), respondent Hanna envisions a friend, while Hans envisions a stranger when answering the corresponding survey question. This scenario highlights the ongoing debate on equivalence and whether the concepts in the questions are uniformly interpreted by all respondents [@Bauer2018-ex]. Consequently, due to these varying associations, Hanna's response reflects particularized trust, resulting in a specification error, while Hans's response more closely aligns with the notion of the generalized other. These differences in associations can lead to divergent responses on the trust scale between two individuals (e.g., Hans and Hanna) or even within the same individual at different points in time (depicted by the dashed line in Figure \@ref(fig:fig-theory)).    
Given that the conceptual definition of generalized (and particularized) trust refers to the distinction between known and unknown others, our study aims to identify the associations arising from the specific wording of survey questions. Empirical evidence in that direction is given by @Sturgis2010-sa. In examining the most people question using think-aloud probing, they describe six higher-order topics they found respondents to associate with the term "most people". The two largest categories they found by manually classifying responses to their probing question were "known others" (42\%) and "unknown others" (22\%).^[ Smaller categories they found refer to "local community" (e.g., people in their town) (3\%), "job/profession" (e.g., politicians, salesmen) (4\%), "other" (e.g., "trusting is naive") (5\%) and "don't know/no answer" (6\%).] In a similar approach, @Bauer2018-ex surveys student samples from Switzerland using a probe that asks respondents who they had in mind when answering the most people question. The open-ended text answers reveal that “respondents do not necessarily tend to think of strangers or people that are unknown to them. Many think of situations (e.g., meeting someone in the train/street) or of people they know (e.g., friends, family members, etc.)” [@Bauer2018-ex, 9]. Lastly, Uslaner [-@Uslaner2002-md, 72-74], as part of the 2000 ANES Pilot Survey, investigated the most people question via think-aloud techniques and showed that 58\% of the respondents referred to a “general worldview” while 23\% mentioned “personal experiences”. While personal experiences do not necessarily involve known others, the 2000 ANES data was also coded into more fine-grained categories by Johnson [cf. @ANES_2000-wz]: 8\% of respondents referred to family members, 11\% to co-workers and 12\% to neighbors.    
The present study compares three established measures of generalized social trust, the "most people question" (M1), the "people first time question" (M2) and the "stranger question" (M3). Next to M1, M2 is the second most common generalized trust measure used in many large-scale surveys, such as the World Values Survey or the Socio-Economic Panel in Germany. M3 is a more recent measurement approach, which is not yet part of larger surveys, and was developed with the aim that respondents imagine strangers in their answer [@Robbins2019-nr; @Robbins2021-yx]. Our particular interest for each of these measures lies in the proportion of respondents who think of personally known others (short: known others) when answering, expressed as $p_{k} = \frac{1}{n}\sum_{i=1}^{n}Y_i$, where $Y_{i}$ is a dummy that indicates whether individual $i$ thought of known others ($1$) or unknown others ($0$) in their response. Importantly, across the three measures M1--M3, the trustee category is gradually refined. M1 is fairly vague and only refers to most people. M2 already specifies that respondents should think of first-time encounters. M3 further specifies the trustee category by clarifying that it encompasses strangers. We expect that explicitly referring to "people you meet for the first time" (M2) or "a total stranger you meet for the first time" (M3) as compared to "most people" (M1) may increase the proportion of respondents thinking of others they do not know ($1-p_{k}$). Furthermore, we expect that using the stranger-wording (M3) should increase this share even more than using the people-wording (M2). In our view, the people-wording is more likely to produce associations of situations where the respondent has had first-time encounters with persons that are well-known by now. For instance, respondents may think of a first-time encounter with friends, work colleagues or relatives or first-time encounters with persons who are already connected (e.g., first time meeting the new partner of a sibling). In contrast, the stranger-wording should make it more likely that respondents think about situations in which they really don’t have (or haven’t had) any information about the trustee (e.g., encounters in the street). Eventually, we hypothesize that a refinement of the trustee category (most people $\rightarrow$ people you meet for the first time $\rightarrow$ a total stranger you meet for the first time) decreases the proportion of respondents in whom the association with known people ($p_{k}$) is evoked (H~1~). Evidence for H~1~ would be provided by statistically significant differences between those proportions: $p_{k, M1} > p_{k, M2}$; $p_{k, M1} > p_{k, M3}$; $p_{k, M2}> p_{k, M3}$.   
Additionally, following @Sturgis2010-sa, we also expect that individual associations with known others positively influence trust scores (H~2~) across all three measures. For instance, when calculating the aggregate mean level of trust, $\bar{y} = \frac{1}{n}\sum_{i=1}^{n}y_i$, where $y_i$ is an individual $i$'s reported trust score, we could expect a positive difference in trust between the subset of respondents who think of known others and respondents who think of unknown others. Estimating such differences could help us identify the measurement error that is included in common aggregate estimates of trust scores.
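
The quantities behind H~1~ and H~2~ can be illustrated with a minimal sketch on simulated data. The data frame `sim`, its column names, and the simulated shares are hypothetical and are not taken from our data. The sketch also treats the three measures as independent groups, whereas an actual analysis would have to account for the within-person design.

```r
# Hypothetical, simulated data: one row per respondent-measure pair.
set.seed(1)
n <- 500
sim <- data.frame(
  measure = rep(c("M1", "M2", "M3"), each = n),
  known   = c(rbinom(n, 1, 0.40), rbinom(n, 1, 0.25), rbinom(n, 1, 0.10))
)
# Simulated trust scores on [0, 1], shifted upwards for known-other associations.
sim$trust <- pmin(pmax(0.4 + 0.15 * sim$known + rnorm(nrow(sim), sd = 0.2), 0), 1)

# p_k per measure: the mean of the dummy Y_i.
p_k <- tapply(sim$known, sim$measure, mean)

# H1 (one pair): one-sided test of p_k(M1) > p_k(M3), assuming independent groups.
counts  <- as.vector(tapply(sim$known, sim$measure, sum)[c("M1", "M3")])
h1_test <- prop.test(x = counts, n = c(n, n), alternative = "greater")

# H2: difference in mean trust between known-other and unknown-other associations.
h2_diff <- mean(sim$trust[sim$known == 1]) - mean(sim$trust[sim$known == 0])
```

The remaining pairwise comparisons (M1 vs. M2, M2 vs. M3) follow the same pattern.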

## Negative associations    

While trust research regularly discusses the impact of experiences on trust [@Brehm1997-mv; @Glanville2007-ow; @Freitag2009-kd; @Uslaner2002-md; @Cao2014-im; @Glanville2013-sq; @Dinesen2010-mw], studies about trust measurement have neglected this dimension. On average, trust in known others is higher [@Sturgis2010-sa; @Vollan2011-oc; @Bauer2018-ex] -- as is also evidenced by measures that directly gauge trust in family members, neighbors, etc. [@Nannestad2008-fm; @Freitag2009-kd]. Theoretically, however, this does not always have to be the case. In fact, some of the more important betrayals of trust in our lives may happen through people we know. For instance, a close friend may spill our secrets or a family member may fail to return a loan. Referring to Figure \@ref(fig:fig-theory), Hans's response may be based on a negative association as opposed to Hanna's response. Put differently, we may collect negative (or positive) experiences with known others just as we may collect negative (or positive) experiences with unknown others, i.e., strangers. Independently from whether a trustee is known or unknown, individual associations that emerge when answering survey questions may vary in terms of their sentiment. Hence, we also want to measure the proportion of respondents who have negative associations, expressed as $p_{n} = \frac{1}{n}\sum_{i=1}^{n}Y_{i}$, where $Y_i$ is a dummy that indicates whether individual $i$'s association can be classified as negative ($1$) or not ($0$).^[ Where the latter---0---category comprises both neutral and positive associations.]  
Again, the share of negative associations may depend on the measure we use. Since M2 (in contrast to M1) explicitly asks respondents to think of first-time encounters ("people you meet for the first time"), we expect that this question wording may evoke more negative associations than the most people question. This could be either because respondents remember past first-time interactions that turned out to be negative and/or because we are generally taught to be careful in first-time encounters. M3, then, explicitly specifies the trustee as a stranger. The term "stranger" has a rather negative connotation in English compared to the more neutral terms "people" or "person". "Stranger danger" describes the idea that all strangers can potentially be dangerous. In countries such as Great Britain, stranger-danger education, often conducted by local police forces, has the objective to teach children to refuse offers from strangers [@Moran1997-ud, 11]. Postulating H~1~, we assume that M2 and M3 result in higher conceptual validity (i.e., a lower share of associations with known others), which is desirable. However, finding that M3 or M2 in comparison to M1 results in more negative sentiment would be undesirable as it could indicate that using concepts such as "stranger" in M3 affects respondents' mindset.        
We hypothesize that changing trustee categories (most people $\rightarrow$ people you meet for the first time $\rightarrow$ a total stranger you meet for the first time) increases the proportion of respondents who have negative associations ($p_{n}$) (H~3~). Again, evidence for H~3~ would be provided by statistically significant differences between those proportions: $p_{n, M1} < p_{n, M2}$; $p_{n, M1} < p_{n, M3}$; $p_{n, M2} < p_{n, M3}$. We also expect that negative associations should negatively influence trust scores (H~4~) across all three measures. Thus, when calculating the mean level of trust $\bar{y} = \frac{1}{n}\sum_{i=1}^{n}y_i$, where $y_i$ is an individual $i$'s trust score, we expect a negative difference between the subset of respondents who have negative associations and those who do not have negative associations with M1, M2 and M3.


## Situative trust measures   
Empirical operationalizations of generalized trust, for example M1--M3, depict trust as a "one-part relationship, where neither B [the trustee] nor x [expected behavior] enters explicitly" [@Nannestad2008-fm, 415]. In contrast, conceptual work argues that trust is a three-part relationship, in which A (truster) trusts B (trustee) with respect to some behavior X [@Schilke2021-sh; @Cook2005-sy]. 
@Ermisch2009-qf criticize common survey measures of generalized trust to be too generic since the “[...] answers do not reveal either the reference group or the types of action or the stakes that respondents have in mind when making such an assessment” [@Ermisch2009-qf, 750]. Their notion of trust includes a situative character, because they describe a trust situation to be characterized by “trust that someone will do X” [@Ermisch2009-qf, 751; @Ermisch2010-ic, 4].   
The measures we investigate (M4.1--4.4) follow this conceptual work and include the context in which a trust decision takes place. This context entails two components, the trustee category, and the trustee's expected behavior in a certain situation. Importantly, the decision to trust in situation A may not carry over to situation B [@Ermisch2010-ic, 4] even though both situations involve the same trustee.
We argue that situative trust measures may be able to solve some of the problems that characterize the more vague standard measures of generalized trust. Since the latter do not specify either of the two components of context, respondents may simply fill in such specifications themselves.    
Our study investigates situative trust measures introduced by @Robbins2019-nr [@Robbins2021-yx]. These novel measures are based on the stranger question (M3) because they specify the trustee to be a stranger (cf. M3) [see @Buskens2000-fv; @Yamagishi1994-rk; @Yuki2005-lf  for similar approaches]. Further, they specify the expected behavior of the trustee, namely keeping a secret (M4.1), repaying a loan (M4.2), providing advice on managing money (M4.3), and looking after a child/family member/loved one (M4.4). Unlike the stranger question (M3) that allows for varying interpretations by respondents, these situative measures provide a more specific context, leaving less room for ambiguity. This avoids situations where different respondents envision different scenarios, potentially leading to varying trust values (cf. Figure \@ref(fig:fig-theory)).
Analogous to H~1~, we hypothesize that by specifying the trustee as a total stranger, as opposed to most people or people you meet for the first time, the proportion of respondents associating trust with known people ($p_{k}$) will decrease (H~5~).
As these situative measures are relatively new, we do not have specific expectations regarding the negativity of associations they may evoke or how they compare to each other. It is plausible that questions concerning money lending or money advice could elicit negative associations or memories. The question is, however, whether they do so systematically. Therefore, the empirical insights we present below are exploratory in nature.



# Data, experimental design, and methods   

## Sample   

Our target population is U.S. citizens. Data was collected using a two-stage non-probability sample recruited by *Prolific*, a participant recruitment and payment platform for online surveys and experiments [@Palan2018-vg]. First, respondents were identified to be eligible according to quotas on self-reported gender, age, and ethnicity in accordance with the U.S. Census Bureau population group estimates from 2015.^[ Gender: two groups, namely males and females; Age: five groups in 10-year brackets; Ethnicity: five groups, namely White, Mixed, Asian, Black, and Other.] Second, out of 43,131 panelists that were considered eligible, we continued to collect data until our target and final sample size of n=`r dplyr::n_distinct(data$ID_participant)` was reached. Respondents who did not complete the questionnaire (n=87, i.e., an overall response rate of `r round((1-(87/1587))*100, 1)`%) were excluded and replaced with other panelists who would fit the quotas. Summary statistics for all variables and their comparison to population estimates can be found in Online Appendix A.1. The survey was fielded between July 14, 2021 and July 21, 2021. For each completed survey, we paid an average wage of $9.60$ USD/hr; the mean duration was 6.8 minutes.






## Experimental design and measures   

Our questionnaire design is depicted in Table \@ref(tab:tab-design). Respondents provided their data via an online self-administered survey [created using formR, cf. @Arslan2020-wy]. The survey started with information on its objective and a consent form. Subsequently, respondents received two blocks of questions. Block #1 included the standard generalized trust measures with respective probing questions and Block #2 included situative trust measures with respective probing questions. Since we wanted to avoid priming effects (meaning subsequent answers might be influenced by previous questions) we used an experimental design in which the order of questions is randomized. Specifically, the order of Block #1 and #2 as well as the question order within these blocks was randomized. This design allows us to conclude that the differences we find between the trust measures for the outcomes we examined (i.e., the proportion of associations that refer to known individuals or are negative) are actually due to the wording of the question and not to the order of the questions.             
Furthermore, data collected with this questionnaire allows for within- and between-person comparisons for each variable because each respondent received all available trust questions in Block #1 and #2 in a randomized order. To allow further examination of the role of question order despite the introduction of random question order, we can consider two data subsets: Subset 1 only includes respondents' responses to the first trust question they received (ignoring the order of the blocks) and is called "first question only" below; Subset 2 includes respondents' responses to the first trust question from the first block only and is called "first question and first block only" below. While there might still be priming from the preceding block for Subset 1, this possibility should be excluded for Subset 2.   
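
The randomization and the two subsets can be sketched as follows. This is a minimal illustration under our reading of the subset definitions (Subset 1: the first question within each block, whichever block came first; Subset 2: the first question of the block shown first). The function `randomize_order()` and all column names are hypothetical and not part of the survey software.

```r
# Hypothetical sketch: randomize block order and within-block question order
# per respondent, then derive the two subsets.
set.seed(2)
blocks <- list(
  B1 = c("M1", "M2", "M3"),
  B2 = c("M4.1", "M4.2", "M4.3", "M4.4")
)

randomize_order <- function(id) {
  block_order <- sample(names(blocks))            # random order of the two blocks
  shuffled    <- lapply(blocks[block_order], sample)  # random order within blocks
  sizes       <- lengths(shuffled)
  data.frame(
    id         = id,
    measure    = unlist(shuffled, use.names = FALSE),
    block      = rep(block_order, sizes),
    block_pos  = rep(1:2, sizes),                 # was the block shown first or second?
    within_pos = sequence(sizes),                 # position of the item within its block
    position   = seq_len(sum(sizes))              # position in the whole questionnaire
  )
}

design <- do.call(rbind, lapply(1:100, randomize_order))

# Subset 1 ("first question only"): first item of each block.
subset1 <- design[design$within_pos == 1, ]
# Subset 2 ("first question and first block only"): first item of the first block.
subset2 <- design[design$within_pos == 1 & design$block_pos == 1, ]
```

By construction, every row in `subset2` is the very first trust question a respondent saw, so no preceding trust item can have primed the answer.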

```{r tab-design}
table <- data.frame("Intro"= "Information & consent form",
  "Block 1: Generalized trust measures" = "M1: Most people question <br> M2: People first time question <br> M3: Stranger question",
  "Block 2: Situative trust measures" = "M4.1: Keep secret <br> M4.2: Repay loan <br> M4.3: Money advice <br> M4.4: Look after child",
  "Outro"= "Socio-demographics (see Online Appendix A.2)")


names(table) <- c("","","", "")


kable(table, "html", escape = F, align=rep('l', 2),
      caption = "Experimental Design",
      table.attr='class="myTable"',
  col.names = c("<b>Intro</b>", 
                '<b>Block #1:<br> Generalized trust measures</b><br> Randomized question order and probe after all three questions', 
                '<b>Block #2:<br> Situative trust measures</b><br> Randomized question order and probe after question #1 and #4', 
                '<b>Additional questions</b>')) %>%
   kable_styling(font_size=10) %>%
   column_spec(column = 1, width = "1in") %>%
   column_spec(column = 2, width = "3in") %>%
   column_spec(column = 3, width = "3in") %>%
   column_spec(column = 4, width = "1in") %>% 
  kable_classic(full_width = F, html_font = "Times New Roman") %>%
  add_header_above(c(" " = 1, "Order of Blocks #1 and #2 is randomized" = 2, " " = 1), bold = TRUE)  %>%
  add_header_above(c("------------------------ Survey direction ----------------------->" = 4), bold = TRUE, align = "c")
```



### Block #1: Generalized trust measures and probing questions    

In Block #1, we assessed generalized trust using three established measures: trust towards "most people" (M1), "people you meet for the first time" (M2), and "a total stranger you meet for the first time" (M3). These measures had different response categories: 7-, 4-, and 4-point scales for M1, M2, and M3, respectively. To ensure comparability, we employed min-max normalization, which rescales the responses to a range between 0 and 1 while preserving the shape of the original distribution. We treat the resulting variable as continuous for all our analyses.^[ By introducing this assumption, an ordinal-level measure becomes an interval-level measure with discrete categories [@Blaikie2003-yc]. @Carifio2007-lc and @Glass1972-qy describe how Monte Carlo simulations have shown that parametric tests, such as an F-test in a linear regression, are strongly robust to the interval data assumption (as well as moderate skewing) when data was collected using a 5- to 7-point Likert response format (preferably 7), with no resulting bias.] The specific phrasing as well as summary statistics of these questions can be found in the Online Appendix A.2. Each of these closed-ended questions was directly followed by an open-ended probing question using the following wording (exemplary for M1): "In answering the previous question, who came to your mind when you were thinking about 'most people'? Please describe". Our specific interest here is to elicit *who* respondents had in mind when they were exposed to the three different trustee categories.^[ In crafting the above wording, we deliberately chose to repeat the wording of the closed-ended question. This decision was based on pretesting the questionnaire with independent testers, considering their feedback, and being guided by relevant literature on probing techniques [e.g., @Behr2012-oh]. 
Research has shown that repeating the wording can lead to more informative answers compared to presenting the probe without context [@Behr2012-oh]. In principle, repetitions of question wording in probing questions could create demand effects, and further research using appropriate randomized designs to study such effects is necessary.]    
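
As a brief illustration of the rescaling step, min-max normalization maps a response $x$ on a scale with endpoints $x_{min}$ and $x_{max}$ to $(x - x_{min})/(x_{max} - x_{min})$. The sketch below assumes, purely for illustration, that the response categories are coded 1 to 7 and 1 to 4; the helper `rescale_01()` is hypothetical.

```r
# Min-max normalization using the theoretical scale endpoints
# (assumed coding: 1..7 for M1, 1..4 for M2 and M3).
rescale_01 <- function(x, min_cat, max_cat) {
  (x - min_cat) / (max_cat - min_cat)
}

m1_raw <- c(1, 4, 7)     # hypothetical answers on the 7-point scale
m2_raw <- c(1, 2, 3, 4)  # hypothetical answers on a 4-point scale

m1_scaled <- rescale_01(m1_raw, min_cat = 1, max_cat = 7)  # 0, 0.5, 1
m2_scaled <- rescale_01(m2_raw, min_cat = 1, max_cat = 4)  # 0, 1/3, 2/3, 1
```

Using the theoretical endpoints rather than the observed minimum and maximum ensures that the lowest and highest categories map to 0 and 1 even if no respondent chose them.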


### Block #2: Situative trust measures and probing questions    

Block #2 included four situative measures that represent the Imaginary Stranger Trust Scale (IST) developed by Robbins [@Robbins2019-nr; @Robbins2021-yx; @Robbins2022-yn]. These measures specify the trustee category as well as the content of the trust relationship, overall aiming to reduce the vagueness we argued to find for the standard generalized trust measures from Block #1. The four items elicit trust in a total stranger met for the first time to^[ A randomly selected share of respondents was assigned an alternative wording to the one describing the trustee as a stranger met for the first time, namely which describes the trustee as a person met for the first time (question wordings can be found in Online Appendix A.2).], (1) "keep a secret that is damaging to your reputation" (M4.1), (2) "repay a loan of one thousand dollars" (M4.2), (3) "provide advice about how best to manage your money" (M4.3) and to (4) "look after a child, family member, or loved one while you are away" (M4.4). Each of these items was rated on a 4-point scale. We applied min-max normalization to rescale these items to a range between 0 and 1.   
Again, the question order was randomized. Analogous to Block #1, the situative measures were also probed using the following wording: "In answering the previous question, who came to your mind when you were thinking about 'a total stranger you meet for the first time'? Please describe." To avoid memory effects as well as errors due to response fatigue, we only probed the situative measures that were randomly assigned to come first and fourth.


## Methods    
           
Table \@ref(tab:tab-exemplary-data) illustrates the structure of our data. Due to the intra-person design, there are multiple measures of trust (i.e., seven) (indicated by the column `Measure`) for each respondent alongside their respective trust score (column `Trust`). Overall, we collected open-ended responses using five open-ended probing questions and received `r sum(!is.na(data$probing_answer))` out of potentially 7,500 text answers (column `Probing Answer`).^[ Each respondent was probed for each generalized trust measure (M1 -- M3), resulting in $3 \times 1{,}500$ entries, as well as for two out of four situative trust measures (M4.1 -- M4.4), resulting in an additional $2 \times 1{,}500$ entries. Out of 10,500 answers to trust questions, 3,000 responses were not probed.] Online Appendix A.3 provides a detailed description of the open-ended text answers. Table \@ref(tab:tab-exemplary-data) also displays the results for our classification of the open-ended responses (columns `Associations (known--unknown others)` and `Associations (sentiment)`). Both approaches are described in detail below.    

```{r tab-exemplary-data}
table_exemplary_data <- data %>% 
  filter(probing_answer!="") %>%
  mutate(ID_participant_long = as.numeric(factor(ID_participant_long))) %>%
  filter(!is.na(value)) %>%
  rename(measure = variable) %>%
  select(ID_participant_long, 
         measure,
         value,
         probing_answer,
         code_known_unknown,
         code_sentiment_dichotomous) %>%
  group_by(ID_participant_long) %>% 
  arrange(ID_participant_long, measure) %>% 
  rename(Person_ID = ID_participant_long) %>%
  mutate(value = round(value,2))%>%
  mutate(across(everything(), as.character)) %>% 
  mutate(Person_ID = as.character(Person_ID))  %>%
  mutate(measure= factor(measure, levels= c("Most people",
                                               "People first time",
                                               "Stranger",
                                               "Keeping a secret",
                                               "Repaying a loan",
                                               "Watching a loved one",
                                               "Money advice")))  %>%
  mutate(code_known_unknown = recode(code_known_unknown,
                                     "Yes" = "1 (Yes)",
                                     "No" = "0 (No)"),
         code_sentiment_dichotomous = recode(code_sentiment_dichotomous,
                                     "negative" = "1 (negative)",
                                     "neutral/positive" = "0 (neutral/positive)")) %>% 
  rename("Associations (sentiment)" = code_sentiment_dichotomous,
         "Associations (known-unknown)" = code_known_unknown,         
         "Probing Answer" = probing_answer,
         ID = Person_ID,
         "Trust   " = value,
         Measure = measure)


# Select observations
table_exemplary_data <- table_exemplary_data %>% arrange(Measure) %>% 
  filter(`Probing Answer`!="") %>% 
  #filter(ID %in% c(4286, 7304, 1365, 7214, 1, 123, 3139, 2980, 1289, 1487))%>%
  filter(ID %in% c(123,3100,7095,7181,1348,2941,1275,1466,4238,1))%>% #use other IDs for anonymized data bc order in the dataset changed due to probing_answer!="" filter
  bind_rows(set_names(rep("...", ncol(.)), colnames(.)))




# Colors
color.me.stranger <- which(table_exemplary_data$Measure=="Stranger")
color.me.peoplefirsttime <- which(table_exemplary_data$Measure=="People first time")
color.me.mostpeople <- which(table_exemplary_data$Measure=="Most people")
color.me.secret <- which(table_exemplary_data$Measure=="Keeping a secret")
color.me.loan <- which(table_exemplary_data$Measure=="Repaying a loan")
color.me.watchingloved <- which(table_exemplary_data$Measure=="Watching a loved one")
color.me.money <- which(table_exemplary_data$Measure=="Money advice")

# Table
kable(table_exemplary_data, format = "html", align=rep('l', 5),
      table.attr='class="myTable"',
      caption = 'Illustration of exemplary data') %>%
  # row_spec(color.me.stranger, background = "#66c2a515") %>%
  # row_spec(color.me.peoplefirsttime, background = "#fc8d6215") %>%
  # row_spec(color.me.mostpeople, background = "#a6d85415")  %>%
  # row_spec(color.me.secret, background = "#f9f2fa") %>%
  # row_spec(color.me.loan, background = "#edfffc") %>%
  # row_spec(color.me.watchingloved, background = "#feffe7") %>% 
  # row_spec(color.me.money, background = "#f2fdec") %>%
  row_spec(color.me.stranger, background = "lightgrey") %>%
  row_spec(color.me.peoplefirsttime, background = "white") %>%
  row_spec(color.me.mostpeople, background = "lightgrey")  %>%
  row_spec(color.me.secret, background = "white") %>%
  row_spec(color.me.loan, background = "lightgrey") %>%
  row_spec(color.me.watchingloved, background = "white") %>%
  row_spec(color.me.money, background = "lightgrey") %>% 
  row_spec(row = 0, bold = TRUE) %>% 
  kable_classic(html_font = "Times New Roman") %>%
  kable_styling(font_size = 10) %>% 
  column_spec(column = 1, width = "1in")%>%
  column_spec(column = 2, width = "1in")%>%
  column_spec(column = 3, width = "1in")%>%
  column_spec(column = 4, width = "3in")%>%
  column_spec(column = 5, width = "1in")%>%
  column_spec(column = 6, width = "1in") %>% 
  add_footnote("Note: The table displays different exemplary respondents. Note that in the actual dataset each respondent/ID (cf. column 1) appears seven times, because each respondent received all 7 trust items (for 5 of these questions the respondents received a respective probing question).")
```

Both classifications (i.e., known--unknown and sentiment) were achieved using automated text analysis, which in survey data research has become a popular alternative to manual coding [@Esuli2010-zv; @Giorgetti2003-ei; @gweon2022-ar]. In particular, we pursued a supervised classification approach in which randomly sampled subsets of text answers were manually labeled and only the remainder were automatically classified using fine-tuned BERT models.    
For the known--unknown classification, we manually labeled a sample of n=1,000 text answers, while for the sentiment classification, we increased^[ Detecting sentiment proves more complex than spotting mentions of known and unknown others due to several factors, such as ambiguous word meanings.] this number to n=1,500. Both samples were a random selection of text answers from the generalized trust measures (see Online Appendix A.5.2 for further details). Based on previous implementations in the literature, we argue that these sample sizes are sufficiently large.^[ @Schonlau2016-oq, for instance, show that 500 observations suffice for training the task of categorizing open-ended survey answers and that additional time savings could be attained by reducing the training data to even 300 or 200 observations, but only for less complex problems. Since @Schonlau2016-oq are concerned with a multinomial rather than a binary classification problem (the latter being the less complex task), our training data of n=1,000/1,500 should be large enough. In general, automated categorization is shown to result in meaningful time savings as opposed to manual classification as soon as the data to be classified exceeds 1,500 documents [@Schonlau2016-oq].]    
Both manual classification tasks were achieved using a hand-crafted coding scheme. For both schemes the main distinction lies between two categories. In the known--unknown classification, Code 0 was assigned when respondents mentioned individuals or groups of individuals that can be identified as "unknown others" in their text answer. Importantly, our primary focus was on identifying respondents' personal unfamiliarity with these individuals or groups, and not on the specific characteristics of these individuals/groups. For example, an answer that describes personally unknown others that have rather specific characteristics (e.g., tourists in ID 3139 in Table \@ref(tab:tab-exemplary-data)) falls into Code 0.^[ Coding of the n=1,000 training data observations shows that circa 9% of the answers include mentions of "groups of people"; these instances were all coded as "unknown others".] Code 1, on the other hand, subsumes all statements that mentioned others known to the respondent. Survey answers that had no references to either known or unknown others (e.g., "just people as a whole") were coded as 0, and survey answers with mixed references to both known and unknown others (e.g., "People I may run into everyday.") were coded as 1. To label sentiment, the main distinction lies between "negative sentiment" (Code 1) and "neutral or positive sentiment" (Code 0). Online Appendix A.4 provides an overview of the coding schemes with examples and descriptions of all available codes.    
The manual classification was carried out by three independent coders. All three coders assigned codes to the same 1,000/1,500 text answers, and conflicts were resolved by finding consensus between the coders or using majority vote.         
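
The majority-vote rule can be illustrated with a minimal sketch. The coder labels below are invented for illustration, and the consensus discussions that preceded the vote in practice are not modeled.

```r
# Hypothetical labels from three coders (1 = known others, 0 = unknown others).
codes <- data.frame(
  coder_a = c(1, 0, 1, 0),
  coder_b = c(1, 0, 0, 0),
  coder_c = c(1, 1, 0, 0)
)

# With three coders and a binary code, a majority always exists.
majority <- as.integer(rowSums(codes) >= 2)

# Flag answers without full agreement, e.g., as candidates for discussion.
conflict <- rowSums(codes) %in% c(1, 2)
```

Here the second and third answers are flagged as conflicts, and both are resolved to Code 0 by majority vote.
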
For the remainder of text answers (i.e., n=6,500/6,000), we fine-tuned the weights of two BERT models (BERT base model in its uncased version), using the manually coded data (n=1,000/1,500) as training data. BERT (Bidirectional Encoder Representations from Transformers) [@Devlin2018-gh] is an empirically powerful machine learning technique that can be used for various natural language processing tasks [@Devlin2018-gh,1]. BERT comes with two attributes that are of special importance here: first, it is able to model contextual representations by incorporating both the left and right context of a document (i.e., bidirectional). Second, BERT provides pre-trained vector representations for words by using a deep, pre-trained neural network. These so-called embeddings suggest a representation for each term based on its context by using information from the entire input sequence. For our data, this could mean, for example, that terms that appear in the (pre-trained) context of “family”, e.g. brother and sister, are likely to be predicted as “known other”.
Last but not least, by using BERT, we aim to address the class imbalance that is present in our sentiment data insofar as few respondents (`r prop.table(table(data$manual_code_sentiment_dichotomous))[2]*100`%) have negative associations. BERT achieves higher class-wise accuracy in the presence of class imbalance than n-gram-based machine learning techniques [@gweon2022-ar], and has further been shown to remove the need for data augmentation techniques to mitigate problems of imbalanced data [@Madabushi2020-ur].^[ Still, we attempted oversampling the minority class [see e.g., @Gosain2017-gp] to address the problem of class imbalance. This, however, did not lead to any further significant improvements. Results are available upon request.] Importantly, the imbalanced data structure and its consequences do not call into question the effects we found but may have resulted in their slight underestimation. Online Appendix A.5.2 shows our findings when using the manually classified data only.    
A detailed evaluation of the two classifiers in terms of accuracy, precision, recall and F1-Score can be found in Table \@ref(tab:model-metrics).
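
For readers less familiar with these metrics, the following sketch shows how class-wise precision, recall and F1 score, as well as overall accuracy, can be computed from manually assigned and predicted labels. The vectors `truth` and `pred` and the helper `class_metrics()` are hypothetical, and the values are invented; they are not taken from our classifiers.

```r
# Invented labels for ten answers (1 = minority class, 0 = majority class).
truth <- c(1, 0, 0, 1, 0, 0, 1, 0, 0, 0)  # manual codes
pred  <- c(1, 0, 0, 0, 0, 1, 1, 0, 0, 0)  # hypothetical model predictions

# Class-wise metrics, treating `positive` as the class of interest.
class_metrics <- function(truth, pred, positive) {
  tp <- sum(pred == positive & truth == positive)  # true positives
  fp <- sum(pred == positive & truth != positive)  # false positives
  fn <- sum(pred != positive & truth == positive)  # false negatives
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  f1 <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f1 = f1)
}

metrics_1 <- class_metrics(truth, pred, positive = 1)
metrics_0 <- class_metrics(truth, pred, positive = 0)
accuracy <- mean(truth == pred)

print(rbind(class_0 = metrics_0, class_1 = metrics_1))
print(accuracy)
```

Reporting the metrics per class matters under class imbalance, because a classifier can reach high accuracy while performing poorly on the minority class.
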


```{r model-metrics}
table_model_metrics_known_unknown <- read.csv("../data/table_model_metrics_known_unknown.csv")[ ,1:4]

table_model_metrics_sentiment <- read.csv("../data/table_model_metrics_sentiment.csv")[ ,1:4]
colnames(table_model_metrics_sentiment) <- paste("sentiment", colnames(table_model_metrics_sentiment), sep = "_")

table_model_metrics <- cbind(table_model_metrics_known_unknown, table_model_metrics_sentiment)


table_model_metrics <- table_model_metrics %>% 
 mutate_if(is.numeric, round, digits=2)

table_model_metrics[,c(1,5)][table_model_metrics[,c(1,5)] == "0.0"] <- "0"
table_model_metrics[,c(1,5)][table_model_metrics[,c(1,5)] == "1.0"] <- "1"

table_model_metrics[3,c(2,3,6,7)]<- ""


gt(table_model_metrics, caption="Accuracy, precision, recall and F1-score") %>% 
  cols_label(X="", precision="Precision", recall="Recall", f1.score="F1 Score", sentiment_X="", sentiment_precision="Precision", sentiment_recall="Recall", sentiment_f1.score="F1 Score") %>% 
    tab_spanner(
    label = "Associations (sentiment)",
    columns = 5:8) %>%  
  tab_spanner(
    label = "Associations
(known-unknown)",
    columns = 1:4) %>%  
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white",
    table.font.names = "Times New Roman"
  )

```




Alternative approaches with which we classified our data (i.e., regular expressions and Random Forest) can be found in Online Appendix A.6.

# Results   

## Trust scores across standard and situative measures    
We begin by assessing the variations in trust scores obtained from our seven trust measures across different sample specifications (Figure \@ref(fig:dotplot-means-subsets)). Regardless of the subsample, there is a gradual decline in trust from Measure 1 (most people question) to Measure 2 (people first time question), and finally to Measure 3 (stranger question).

```{r anova-and-pairwise-comparisons}
# within-subject ANOVA
comparison_data <- data %>% subset(stage==1) #M1-M3 only

anova_test <- anova_test(data=comparison_data, dv = value, wid = ID_participant, within = variable) #within-subject ANOVA
anova<-get_anova_table(anova_test)

# pairwise comparison of all variables
pwc <- data %>%
  pairwise_t_test(
    value ~ variable, paired = TRUE,
    p.adjust.method = "bonferroni")

# create dataframe with p values so that later we can use stat_pvalue_manual to manually add p-values to the ggplot
# unfortunately we cannot rely on ready packages such as ggpubr or ggsignif, because the p values there are not bonferroni corrected and other issues (e.g., p is missing from one label)

p_values <- pwc %>% select(c("group1", "group2", "p"))
p_values <- p_values[c(1,2,7),]

p_values <- p_values %>%
  mutate(
    group1 = recode(group1,
      "Most people" = "M1:\nMost people",
      "People first time" = "M2:\nPeople\nfirst time",
      "Stranger" = "M3:\nStranger"
    )
  )

p_values <- p_values %>%
  mutate(
    group2 = recode(group2,
      "Most people" = "M1:\nMost people",
      "People first time" = "M2:\nPeople\nfirst time",
      "Stranger" = "M3:\nStranger"
    )
  )
```

```{r dotplot-means-subsets, fig.cap="Standardized trust scores across different trust measures and respondent subsets", fig.height=4.5, fig.width=7, message=FALSE, warning=FALSE, out.extra=''}

# data that excludes person-wording for the situative measures
data_stranger_only <- data %>% mutate(value=ifelse((variable=="Money advice" |
                         variable=="Watching a loved one" |
                         variable=="Keeping a secret" |
                         variable=="Repaying a loan") & ran_stage2_person_stranger == 1, NA, value)) #7405 values

levels(data_stranger_only$variable) <-recode(levels(data_stranger_only$variable),
         "Most people" = "M1:\nMost people",
         "People first time" = "M2:\nPeople first time",
         "Stranger" = "M3:\nStranger",
         "Keeping a secret" = "M4.1:\nKeeping a secret",
         "Repaying a loan" = "M4.2:\nRepaying a loan",
         "Watching a loved one" = "M4.3:\nWatching a loved one",
         "Money advice" = "M4.4:\nMoney advice")

levels(data_stranger_only$variable) <- 
str_wrap(levels(data_stranger_only$variable), width = 15)
data_stranger_only$variable <-
gsub(": ", ":\n",
     data_stranger_only$variable)

# construct three different subsets of df
df_plot_1 <- data_stranger_only %>% select(variable, value)

df_plot_2 <- data_stranger_only %>% 
  subset(only_first_gentrust_question==TRUE | only_first_sittrust_question==TRUE) %>% 
  select(variable, value)

df_plot_3 <- data_stranger_only %>% 
  subset((only_block1==TRUE & only_first_gentrust_question==TRUE) | (only_block2==TRUE & only_first_sittrust_question==TRUE)) %>% 
  select(variable, value)

# melt and mutate identifier variable
newData <- reshape2::melt(list(df_plot_1 = df_plot_1, df_plot_2 = df_plot_2, df_plot_3 = df_plot_3), id.vars = "variable") 


# Number of observations
n_fun <- function(x){
  return(data.frame(y = 1, label = paste0("n = ",length(x)))) # mean(x)
}

ggplot(newData %>% filter(!is.na(value)),
       aes(
         x = variable,
         y = value,
         group = L1
       )) +
  stat_summary(
    aes(linetype = L1),
    fun = mean,
    geom = "line",
    position = position_dodge(width = 0.5)
  ) +
  stat_summary(
    aes(group = L1),
    fun.data = mean_cl_normal,
    fun.args = list(conf.int = 0.95), # arguments for mean_cl_normal are passed via fun.args
    geom = "linerange",
    lwd = 0.4,
    position = position_dodge(width = 0.5)
  ) +
  stat_summary(
    aes(linetype = L1),
    fun = mean,
    geom = "point",
    size = 1,
    position = position_dodge(width = 0.5)
  ) +
  stat_summary(
    fun.data = n_fun,
    geom = "text",
    angle = 90,
    vjust = 0,
    hjust = 0.8,
    size = 3,
    position = position_dodge(width = 0.5),
    family = "Times New Roman"
  ) +
  labs(y = "Trust score (standardized)", x = "Measure", caption = "Note: The figure shows point estimates for average trust scores and 95% confidence intervals. Details on the respondent subsets are provided in\nthe Methods Section. P-values are derived from t-tests for the Full dataset, for details see footnote 13. Data for M4.1-4.4 include the 'stranger'\nwording only (see footnote 9).") +
  scale_linetype_manual(
    name = "Data",
    labels = c(
      "Full dataset",
      "Subset 1: first question only",
      "Subset 2: first question and first block only"
    ),
    values = c("solid", "dashed", "dotted")
  ) +
  ggpubr::stat_pvalue_manual(
    p_values[,1:3],
    y.position = c(0.55, 0.65, 0.75),
    size = 3,
    label = "p: {scales::pvalue(p)}",
    family = "Times New Roman"
  ) +
  theme_minimal() +
  theme(
    axis.title.x = element_blank(),
    axis.text.x = element_text(angle = 30, hjust = 1),
    legend.position = "bottom",
    plot.caption = element_text(hjust = 0, size = 8),
    text = element_text(family = "Times New Roman")
  ) +
  scale_y_continuous(breaks = seq(0, 1, 0.25),
                     limits = c(0, 1))
```

A within-subjects ANOVA reveals that generalized trust scores differ statistically significantly within individuals across the three question wordings (F(`r anova[,"DFn"]`, `r anova[,"DFd"]`)=`r anova[,"F"]`, $p<0.001$).^[ Moreover, we investigated the full dataset via paired-sample t-tests with a Bonferroni-adjusted alpha level of .017 per test (.05/3): on average, the trust score for M1 (M = `r mean(comparison_data$value[comparison_data$variable == "Most people"], na.rm=T)`, SD = `r sd(comparison_data$value[comparison_data$variable == "Most people"], na.rm=T)`) was significantly higher than the trust score for M3 (M = `r mean(comparison_data$value[comparison_data$variable == "Stranger"], na.rm=T)`, SD = `r sd(comparison_data$value[comparison_data$variable == "Stranger"], na.rm=T)`), t(`r pwc[2,"df"]`) = 13.81, $p<0.001$. Furthermore, but to a lesser extent (as is also depicted in Figure \@ref(fig:dotplot-means-subsets)), M1, on average, results in higher trust scores than M2 (M = `r mean(comparison_data$value[comparison_data$variable == "People first time"], na.rm=T)`, SD = `r sd(comparison_data$value[comparison_data$variable == "People first time"], na.rm=T)`), t(`r pwc[1,"df"]`) = 3.11, $p<0.01$. Also, the difference in trust scores between M2 and M3 is statistically significant, t(`r pwc[7,"df"]`) = 15.15, $p<0.001$.]   
Additionally, situative trust measures M4.1 to M4.4 consistently exhibit lower trust levels likely owing to their emphasis on trust decisions where the truster has a lot to lose.^[ To address potential outliers in individual situations, we propose exploring the concept of "cross-situational trust" [@Bauer2018-ex] and computing an average across measures (see our detailed idea and discussion on this in the conclusion). This approach could help mitigate the impact of strong outliers from specific situations.] It is crucial to note that Figure \@ref(fig:dotplot-means-subsets) provides a descriptive overview of the seven measures concerning their sample means. The observed differences may be influenced by various factors, such as question interpretation, demand effects, and scale effects. In our subsequent analysis, we focus on examining one specific factor: the associations formed by respondents when answering our trust survey questions.
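
The logic of Bonferroni-adjusted paired comparisons can be sketched in base R as follows. This is an illustration on simulated data, not a reproduction of our analysis: the data frame `scores`, its columns `m1` to `m3`, and all simulation parameters are assumptions.

```r
# Simulated within-subject trust scores for three question wordings.
# All values are invented for illustration.
set.seed(1)
n <- 50
base <- rnorm(n, mean = 0.5, sd = 0.15)  # respondent-level baseline
scores <- data.frame(
  m1 = base + rnorm(n, 0.05, 0.1),
  m2 = base + rnorm(n, 0.03, 0.1),
  m3 = base + rnorm(n, -0.10, 0.1)
)

# Paired t-test for each of the three pairwise comparisons.
pairs <- list(c("m1", "m2"), c("m1", "m3"), c("m2", "m3"))
p_raw <- sapply(pairs, function(p) {
  t.test(scores[[p[1]]], scores[[p[2]]], paired = TRUE)$p.value
})
names(p_raw) <- sapply(pairs, paste, collapse = " vs ")

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(cbind(p_raw, p_bonf))
```

Comparing the adjusted p-values with .05 is equivalent to comparing the unadjusted p-values with .05 divided by the number of tests.
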

\newpage

## Associations across standard and situative measures    

We start by examining the known--unknown dimension. Figure \@ref(fig:fig-known-people) displays the share of respondents who described associations with either known or unknown others across our seven measures.^[ Online Appendix A.5.2 shows these results using data from the manually coded share of data only (n=1,000/1,500). Online Appendix A.5.3 shows these results using data for Subset 2 only (n=1,500).] In line with our expectation (H~1~), the share of respondents referring to a known other is statistically significantly lower for M3 (i.e., `r data %>% filter(variable=="Stranger") %>% sjstats::prop(code_known_unknown=="Yes")*100`%), while the shares for M1 and M2 are similar (`r data %>% filter(variable=="Most people") %>% sjstats::prop(code_known_unknown=="Yes")*100`% and `r data %>% filter(variable=="People first time") %>% sjstats::prop(code_known_unknown=="Yes")*100`%, respectively). The share of respondents referring to a known other increases again for our situative measures M4.1--M4.4; however, none of these differences is statistically significant. Nevertheless, this pattern could indicate that referring to specific situations and behaviors in survey questions increases the number of respondents who think of known others. This is undesirable from a conceptual perspective.

```{r fig-known-people, fig.cap="Distribution of associations with known people across trust measures"}
# Create new dataset for graph
data2 <- data

levels(data2$variable) <- recode(levels(data2$variable),
         "Most people" = "M1:\nMost people",
         "People first time" = "M2:\nPeople first time",
         "Stranger" = "M3:\nStranger",
         "Keeping a secret" = "M4.1:\nKeeping a secret",
         "Repaying a loan" = "M4.2:\nRepaying a loan",
         "Watching a loved one" = "M4.3:\nWatching a loved one",
         "Money advice" = "M4.4:\nMoney advice")

levels(data2$variable) <- 
str_wrap(levels(data2$variable), width = 15)
data2$variable <-
gsub(": ", ":\n",
     data2$variable)

ggplot(data2 %>% 
         filter(!is.na(code_known_unknown)), 
       aes(x= code_known_unknown, group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)), 
              stat= "count",
              family = "Times New Roman",
              size = 3) + # 4 for presentations
    labs(y = "% of respondents", 
         fill="Associations \n(known others)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0). Data is the full dataset irrespective of the question or block\nrandomization (details are provided in the Methods Section). Results for different Subsets of the data can be found in Online Appendix A.5.2\nand A.5.3.") +
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data$code_known_unknown)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data$code_known_unknown)) + 
  theme_minimal() +
  #theme_minimal(base_size = 14) + #presentations
  theme(axis.title.x=element_blank(),
        axis.text.x = element_text(angle = 30, hjust = 1),
        legend.position="bottom",
        #legend.position="right", #presentations
        text = element_text(family = "Times New Roman"), #remove for presentations
        #legend.title = element_text(size=12), #presentations
        #legend.text = element_text(size=12), #presentations
        #strip.text = element_text(size=12), #presentations
        plot.caption = element_text(hjust = 0, size=9))+
  xlab("")
```
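
The 95% confidence intervals shown as error bars rest on a normal approximation for a proportion, $\hat{p} \pm z_{0.975}\sqrt{\hat{p}(1-\hat{p})/n}$. A stand-alone sketch with invented counts follows; the helper name `wald_ci()` and the numbers are assumptions for illustration only. Truncating the bounds to the unit interval corresponds to the lower cutoff at 0 mentioned in the figure notes.

```r
# Normal-approximation (Wald) confidence interval for a proportion.
# k = number of respondents in the category, n = number of respondents overall.
wald_ci <- function(k, n, level = 0.95) {
  p <- k / n
  z <- qnorm(1 - (1 - level) / 2)  # 1.96 for a 95% interval
  se <- sqrt(p * (1 - p) / n)
  c(
    estimate = p,
    lower = max(0, p - z * se),  # truncate at 0
    upper = min(1, p + z * se)   # truncate at 1
  )
}

# Invented example: 30 of 1,000 respondents mention a known other.
ci <- wald_ci(k = 30, n = 1000)
print(round(ci, 3))
```

For very small shares, the normal approximation becomes rough, which is one reason why the lower bound can fall below zero before truncation.
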

With regard to the sentiment dimension, we expected to find different shares of negative sentiment for each question wording (see Figure \@ref(fig:fig-sentiment)). In line with our expectations (H~3~), the share of negative associations is higher for M3 (i.e., `r data %>% filter(variable=="Stranger") %>% sjstats::prop(code_sentiment_dichotomous=="negative")*100`%) than for M2 (`r data %>% filter(variable=="People first time") %>% sjstats::prop(code_sentiment_dichotomous=="negative")*100`%). Contrary to our hypothesis, the share for M1 is higher (`r data %>% filter(variable=="Most people") %>% sjstats::prop(code_sentiment_dichotomous=="negative")*100`%). However, none of these differences is statistically significant. Moreover, the share of negative associations remains similarly low for the situative measures, which is in accordance with the findings for M3, since the situative measures also describe the trustee as a "stranger".   

```{r fig-sentiment, fig.cap="Distribution of associations and their sentiment across trust measures", fig.align="center"}

ggplot(data2   %>% 
    filter(!is.na(code_sentiment_dichotomous)), aes(x= code_sentiment_dichotomous,  
                              group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(
                  #ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
  ymin = ifelse(..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..) < 0, 0, ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)),
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
  
  
    geom_text(aes(label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop.., 
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.25, 2.5, -1.5)),
              stat= "count",
              family = "Times New Roman",
              size = 3) + #4 for presentations
    labs(y = "% of respondents", 
         fill="Associations \n(sentiment)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0). Data is the full dataset irrespective of the question or block\nrandomization (details are provided in the Methods Section).")+
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data$code_sentiment_dichotomous)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data$code_sentiment_dichotomous)) + 
  theme_minimal() +
  #theme_minimal(base_size = 12) + #presentations
  theme(axis.title.x=element_blank(),
        axis.text.x = element_text(angle = 30, hjust = 1), #for presentations 0.8
        legend.position="bottom",
        #legend.position="right", #presentations
        text = element_text(family = "Times New Roman"), #remove for presentations
        #legend.title = element_text(size=12), #presentations
        #legend.text = element_text(size=12), #presentations
        #strip.text = element_text(size=12), #presentations
        plot.caption = element_text(hjust = 0, size=9)) + #left-align+size
  xlab("")
```

```{r pearsons-r}
cor.test <- cor.test(as.numeric(data$code_known_unknown), as.numeric(data$code_sentiment_dichotomous), method="pearson")
```

In sum, we find that, across all seven measures, there are respondents who have associations with known others as well as associations with negative sentiment. However, strong differences between measures in terms of associations can only be found for the known--unknown dimension. The sentiment dimension seems less relevant. The two classification dummies correlate only weakly (r(`r cor.test$parameter`) = `r cor.test$estimate`, $p<0.001$).
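
Because both classifications are binary, a Pearson correlation between them is equivalent to the phi coefficient of the underlying 2x2 table, and its degrees of freedom equal the number of observations minus two. The following sketch uses invented dummies; the vectors `known` and `negative` are hypothetical and do not reproduce our data.

```r
# Two invented binary dummies for 200 answers.
# known = 1: association with a known other; negative = 1: negative sentiment.
known <- rep(c(0, 1), times = c(140, 60))
negative <- c(
  rep(c(0, 1), times = c(126, 14)),  # answers with known == 0
  rep(c(0, 1), times = c(48, 12))    # answers with known == 1
)

# Pearson correlation with test statistic and degrees of freedom (n - 2).
res <- cor.test(known, negative, method = "pearson")

# Phi coefficient computed directly from the 2x2 contingency table.
tab <- table(known, negative)
phi <- (tab[1, 1] * tab[2, 2] - tab[1, 2] * tab[2, 1]) /
  sqrt(prod(rowSums(tab)) * prod(colSums(tab)))

print(c(pearson_r = unname(res$estimate), phi = phi, df = unname(res$parameter)))
```
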
  

## Associations and trust scores    

Above, we demonstrated that there is variation in associations across individuals. Next, we examine whether different associations affect the measurement values. Figure \@ref(fig:coefficient-plot) visualizes the coefficients for a series of regression models (see Tables \@ref(tab:tab-reg-1), \@ref(tab:tab-reg-2), \@ref(tab:tab-reg-3), \@ref(tab:tab-reg-4), \@ref(tab:tab-reg-5), \@ref(tab:tab-reg-6) and \@ref(tab:tab-reg-7) in Online Appendix A.9 for detailed regression tables). We estimated five models for each of our seven trust measures, which are indicated on the left side of the figure. Two models are bivariate and only include one of the association dummies (e.g., Model #1 and #2 in Figure \@ref(fig:coefficient-plot)). We subsequently add covariates to these bivariate regressions (e.g., Model #3 and #4 in Figure \@ref(fig:coefficient-plot)).^[ Age (categorical), sex, ethnicity, socioeconomic status, income, and education.] Finally, the fifth model includes both dummies in one model and adds covariates.

```{r coefficient-plot, fig.cap="Associations and trust scores across different measures", fig.align="center", fig.height=8, out.width="90%"}

data_plot1 <- data %>% 
  select(variable, value, code_known_unknown) %>%
  mutate(code_known_unknown = factor(code_known_unknown, ordered = TRUE)) %>%
  nest(data = c(value, code_known_unknown)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_known_unknown, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "no controls")

data_plot1_controls <- data %>% 
  select(variable, value, code_known_unknown, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_known_unknown = factor(code_known_unknown, ordered = TRUE)) %>%
  nest(data = c(value, code_known_unknown, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_known_unknown + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)",
         Variable != "age_cat18-27",
         Variable != "age_cat28-37",
         Variable != "age_cat38-47",
         Variable != "age_cat48-57",
         Variable != "age_cat58-80+",
         Variable != "sexMale",
         Variable != "ethnicityBlack",
         Variable != "ethnicityMixed",
         Variable != "ethnicityOther",
         Variable != "ethnicityWhite",
         Variable != "socioeconomic_status_num",
         Variable != "income_num",
         Variable != "education_num") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "with controls")

data_plot2 <- data %>% 
  select(variable, value, code_sentiment_dichotomous) %>%
  mutate(code_sentiment_dichotomous = factor(code_sentiment_dichotomous, ordered = TRUE)) %>%
  nest(data = c(value, code_sentiment_dichotomous)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_sentiment_dichotomous, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "no controls")

data_plot2_controls <- data %>% 
  select(variable, value, code_sentiment_dichotomous, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_sentiment_dichotomous = factor(code_sentiment_dichotomous, ordered = TRUE)) %>%
  nest(data = c(value, code_sentiment_dichotomous, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_sentiment_dichotomous + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)",
         Variable != "age_cat18-27",
         Variable != "age_cat28-37",
         Variable != "age_cat38-47",
         Variable != "age_cat48-57",
         Variable != "age_cat58-80+",
         Variable != "sexMale",
         Variable != "ethnicityBlack",
         Variable != "ethnicityMixed",
         Variable != "ethnicityOther",
         Variable != "ethnicityWhite",
         Variable != "socioeconomic_status_num",
         Variable != "income_num",
         Variable != "education_num") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "with controls")

data_plot2_both_controls <- data %>% 
  select(variable, value, code_known_unknown, code_sentiment_dichotomous, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_sentiment_dichotomous = factor(code_sentiment_dichotomous, ordered = TRUE)) %>%
  mutate(code_known_unknown = factor(code_known_unknown, ordered = TRUE)) %>%
  nest(data = c(value, code_known_unknown, code_sentiment_dichotomous, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_known_unknown + code_sentiment_dichotomous + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)",
         Variable != "age_cat18-27",
         Variable != "age_cat28-37",
         Variable != "age_cat38-47",
         Variable != "age_cat48-57",
         Variable != "age_cat58-80+",
         Variable != "sexMale",
         Variable != "ethnicityBlack",
         Variable != "ethnicityMixed",
         Variable != "ethnicityOther",
         Variable != "ethnicityWhite",
         Variable != "socioeconomic_status_num",
         Variable != "income_num",
         Variable != "education_num") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "both + controls")

data_regressions <- bind_rows(data_plot1, 
                       data_plot1_controls,
                       data_plot2, 
                       data_plot2_controls,
                       data_plot2_both_controls)

data_regressions <- data_regressions %>% 
  mutate(variable = recode(variable,
                           "Most people" = "M1:\nMost \npeople",
                           "People first time" = "M2:\nPeople first time",
                           "Stranger" = "M3:\nStranger",
                           "Keeping a secret" = "M4.1:\nKeeping a secret",
                           "Repaying a loan" = "M4.2:\nRepaying a loan",
                           "Money advice" = "M4.3:\nMoney advice",
                           "Watching a loved one" = "M4.4:\nWatching a loved one"
                                                    
                                                    )) %>%
  mutate(Model = factor(Model, levels= c("both + controls",
                                      "with controls", 
                                      "no controls"))) %>% 
  mutate(Variable = recode(Variable, 
                                                    "code_known_unknown.L" = "Associations\n(known others = 1)", 
                                                    "code_sentiment_dichotomous.L" = "Associations\n(negative = 1)"
                                                    )) %>%
  mutate(Variable = factor(Variable, levels = c("Associations\n(negative = 1)",
                                                "Associations\n(known others = 1)"),
                           ordered = TRUE))
  

# Reorder measures alphabetically
data_regressions <- data_regressions %>%
    mutate(variable = factor(variable, 
                             ordered = TRUE,
                             levels = sort(levels(data_regressions$variable)))) 


# Add model names
data_regressions <- data_regressions %>% 
  mutate(n_vars = map(data, ncol)) %>%
  arrange(variable, n_vars, desc(Variable), Model) %>%
  mutate(model_name = paste0("#", 1:42))
  #mutate(model_name = paste0("#", 1:18)) # for presentations

# Replace every 6th row with row five 
for(i in c(6,12,18,24,30,36,42)){
#for(i in c(6,12,18)){ # for presentations
data_regressions$model_name[i] <- data_regressions$model_name[i-1]
}
# subtract 1 from higher numbers so that model numbers stay consecutive (simpler workaround?)
data_regressions <- data_regressions %>% 
  mutate(model_name_num = as.numeric(gsub("#", "", model_name)),
         model_name_num = ifelse(model_name_num>=7, model_name_num-1, model_name_num),
         model_name = paste0("#", model_name_num))
# data_regressions %>% select(variable, Variable, model_name, model_name_num)

library(ggplot2)

ggplot(data_regressions, aes(x = Variable, y = Coefficient, colour = Model, shape = Model)) +
  geom_hline(yintercept = 0, colour = gray(1/2), lty = 2) +
  geom_point(aes(x = Variable, 
                 y = Coefficient), 
             size = 2, 
             position = position_dodge(width = 0.7)) +
  geom_linerange(aes(x = Variable, 
                     ymin = conf.low_90,
                     ymax = conf.high_90),
                 lwd = 1, 
                 position = position_dodge(width = 0.7)) +
  geom_linerange(aes(x = Variable, 
                     ymin = conf.low_95,
                     ymax = conf.high_95),
                 lwd = 1/2, 
                 position = position_dodge(width = 0.7)) +
  geom_text(aes(x = Variable,
                y = conf.high_95 + 0.02,
                label = model_name), 
            position = position_dodge(width = 0.7),
            size = 2.5) +
  ggtitle("Outcome: Trust scores (std. 0-1) for different trust measures") +
  labs(y = "Linear Model Coefficient", 
       caption = "Note: The figure shows point estimates for coefficients of our dummy variables of interest namely having associations with known others or negative\nassociations. Bars represent 90% (thicker) and 95% (thinner) confidence intervals. Data is the full dataset irrespective of the question or block randomization\n(details are provided in the Methods Section).") +
  scale_colour_manual(name = "Model specification",
                      labels = c("both dummies\n+ covariates", "one dummy\n+ covariates", "one dummy\nw/o covariates"),
                      values = c("#ff7f0e", "#9467bd", "#2ca02c")) +
  scale_shape_manual(name = "Model specification",
                     labels = c("both dummies\n+ covariates", "one dummy\n+ covariates", "one dummy\nw/o covariates"),
                     values = c(16, 17, 18)) +  # Use the shape codes that you prefer
  coord_flip() +
  facet_grid(variable ~ .,
             scales = "free",
             space = 'free',
             labeller = label_wrap_gen(width = 8, multi_line = TRUE),
             switch = "y") +
  scale_x_discrete(expand = c(0, 0.2)) +
  xlab("Measure") +
  theme_classic() +
  theme(legend.position = "bottom",
        strip.placement = "outside",
        text = element_text(family = "Times New Roman"),
        panel.spacing = unit(0.7, "lines"))  +
  geom_vline(xintercept = 0.65, linetype = "solid",
             color = "black", size = 0.5) +
  geom_vline(xintercept = 2.35, linetype = "solid",
             color = "black", size = 0.5)

```


```{r reg-results}
#data_plot: 42 observations: for each of the 7 dependent trust variables 2 regressions models (Content + Sentiment) with 3 model specifications (no controls, controls, both+controls), i.e. 6 models per trust variable

mp <- data_regressions %>% filter(variable=="M1:\nMost \npeople")
mp <-mp[,c(1,2,4,5,8,13)]

pft <- data_regressions %>% filter(variable=="M2:\nPeople first time")
pft <-pft[,c(1,2,4,5,8,13)]

sft <- data_regressions %>% filter(variable=="M3:\nStranger")
sft <-sft[,c(1,2,4,5,8,13)]

situative_m4.4 <- data_regressions %>% filter(variable=="M4.4:\nWatching a loved one")
situative_m4.4 <-situative_m4.4[,c(1,2,4,5,8,13)]

situative_m4.2 <- data_regressions %>% filter(variable=="M4.2:\nRepaying a loan")
situative_m4.2 <-situative_m4.2[,c(1,2,4,5,8,13)]
```

In accordance with our expectations (H~2~), we observe that associations with known others have a positive effect on trust for all three of our generalized trust measures M1, M2, and M3 ($\beta_{\#1}$=`r mp$Coefficient[[1]]`; $\beta_{\#6}$=`r pft$Coefficient[[1]]` and $\beta_{\#12}$=`r sft$Coefficient[[1]]`, respectively). While this effect is especially pronounced for M1 and M2 in terms of effect size and statistical significance ($p$<0.001), it becomes smaller and less robust for M3. This may be because M3 evokes associations with known people in fewer respondents than M1 and M2 do (see Figure \@ref(fig:fig-known-people)), which results in a smaller subgroup and thus increases the uncertainty of the corresponding estimate. In addition, adding the sentiment dummy as a control variable in Models $\#5$, $\#10$ and $\#16$ (see Figure \@ref(fig:coefficient-plot)) does not mitigate the effect of the known-unknown dummy on trust.    
In line with our expectation (H~4~), we find that negative associations have a negative effect on trust for all of our three generalized trust measures M1, M2, and M3 regardless of the control set specifications ($\beta_{\#2}$=`r mp$Coefficient[[2]]`, $p$<0.01; $\beta_{\#7}$=`r pft$Coefficient[[2]]`, $p$<0.001 and $\beta_{\#13}$=`r sft$Coefficient[[2]]`, $p$=`r sft$p.value[[4]]`, respectively). While the different generalized trust measures are not affected differently, we suggest that the role of negative associations for trust measurement requires future research.    
The effects for the four situative measures are also in line with H~2~. Associations with known people have a positive effect on, for example, M4.4, i.e., trusting someone to watch a loved one ($\beta_{\#36}$ =`r situative_m4.4$Coefficient[[1]]`, $p$ = `r situative_m4.4$p.value[[1]]`), and on M4.2, i.e., trusting someone to repay a loan ($\beta_{\#24}$ =`r situative_m4.2$Coefficient[[1]]`, $p$ = `r situative_m4.2$p.value[[1]]`). For the situative measures, however, we find smaller and less robust effects for our dummy capturing negative associations, although they remain consistent with H~4~.   
In sum, for the generalized trust measures, we find statistically significant effects in our hypothesized directions, namely that associations with known others (in contrast to unknown others) influence trust scores positively and that negative sentiment (in contrast to neutral/positive sentiment) influences trust scores negatively. The effect of the dummy capturing the known--unknown dimension is especially undesirable from a conceptual point of view, and it varies across measures of generalized trust. We can conclude that estimates based on the three classic measures -- M1, M2 or M3 -- overestimate trust scores because they do not measure generalized trust for a significant share of the respondents. Without these respondents, our estimated trust averages would differ (namely by the coefficients we depict in Figure \@ref(fig:coefficient-plot) for the bivariate models). The bias is smallest for the stranger measure M3, and all four of the situative measures seem to be characterized by the same problem.
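
The relation between a bivariate coefficient and the sample mean can be illustrated with a minimal sketch on simulated data. All names and values below (`known`, `trust`, the share of 0.25, the effect of 0.10) are made up for illustration and are not our survey data. In a bivariate linear model with a single dummy, the coefficient equals the gap between the two group means, so the full-sample mean exceeds the mean among respondents without known-other associations by that gap weighted by the share of respondents with such associations.

```r
# Illustrative simulation only: `known` and `trust` are hypothetical variables.
set.seed(1)
n     <- 1000
known <- rbinom(n, size = 1, prob = 0.25)   # 1 = thought of known others
trust <- pmin(pmax(0.45 + 0.10 * known + rnorm(n, sd = 0.2), 0), 1)

fit  <- lm(trust ~ known)                   # bivariate model with one dummy
beta <- unname(coef(fit)["known"])

mean_all     <- mean(trust)                 # mean over all respondents
mean_unknown <- mean(trust[known == 0])     # mean without "known" respondents
mean_known   <- mean(trust[known == 1])
share_known  <- mean(known)

# The coefficient is the difference in group means ...
gap_matches_beta <- isTRUE(all.equal(beta, mean_known - mean_unknown))

# ... and the shift in the overall mean is share * coefficient.
shift_matches <- isTRUE(all.equal(mean_all - mean_unknown, share_known * beta))
```

Under these assumptions, the size of the shift in the average depends both on the coefficient and on how many respondents think of known others, which is one reason why a measure with a small share of such respondents is less affected.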



# Discussion and conclusion   

Generalized social trust is a foundational concept in the social sciences. However, there have been doubts about the validity of commonly used measures  [@Delhey2011-po; @Sturgis2010-sa; @Ermisch2009-qf; @Nannestad2008-fm; @Robbins2019-nr]. In our study, we examined various trust survey measures in a U.S. sample and explored how respondents answered those questions. To eliminate interviewer effects, we used a web probing approach  [@Behr2012-oh; @Behr2017-xu; @Meitinger2022-cp]. Open-ended probing  [@Neuert2021-cc] is still a novelty in trust research, and similar data has so far only been collected in interviewer-administered settings [@Sturgis2010-sa; @Uslaner2002-md]. The data collected through open-ended probing was analyzed using a supervised machine learning approach.
Our findings can be categorized into four key aspects. First, our study revealed significant variations in overall and intra-individual reported trust levels across different question formats, and the question employing the phrase "most people" yielded the highest average trust score (cf. Figure \@ref(fig:dotplot-means-subsets)). This finding suggests that  the  different question formats should not  be  considered interchangeable measures of generalized trust. However, it is important to note that Figure \@ref(fig:dotplot-means-subsets) provides only a descriptive overview, and our subsequent analysis centered on exploring the associations formed by respondents while answering the trust survey questions.   
Second, we delved into the associations respondents made when responding to the questions. We described generalized trust as  trust in  unknown others, and  argued that it should ideally be measured accordingly.  Remarkably, a notable proportion of respondents (ranging from 13% to 31%, cf. Figure \@ref(fig:fig-known-people)) incorporated thoughts of known individuals in their responses while answering classic trust questions, which is in line with  previous research [e.g., @Sturgis2010-sa]. Hence, for this particular group of respondents, classic trust measures actually do seem to capture what is commonly known as particularized trust [cf. @Freitag2009-kd]. In other words, for these respondents, our measures suffer from construct invalidity. However, the proportion of mentions of known individuals in responses decreased for the "stranger" question (M3), suggesting a higher degree of construct validity for this measure [in line with @Robbins2019-nr; @Robbins2022-yn]. Interestingly, compared to M3, the situative measures (M4.1 - M4.4) showed an increase in respondents thinking about known individuals (but still considerably smaller than in M1 and M2) (cf. Figure \@ref(fig:fig-known-people)), despite being instructed to consider the trustee as a stranger. This outcome may be attributed to respondents drawing upon their past experiences to contextualize and anchor the given situations.   
Third, we examined whether associations influence trust levels. Such an influence would imply that trust estimates produced by specific measures (e.g., the "most people" wording) could be biased, potentially leading to an overestimation of generalized trust in diverse populations. Indeed, we found that respondents who reported thinking about known others displayed higher levels of trust across all three generalized trust measures (cf. Figure \@ref(fig:coefficient-plot)). The effects were less robust for the stranger question (M3), which might be due to the smaller share of respondents having known others in mind when answering. This is a desirable feature of the latter measure.^[ Analogous to @Sturgis2010-sa, we randomized respondents to trust measures in Block #1 and #2; hence, we can conclude that the differences in the distribution of associations are the result of divergent frames evoked by the questions in respondents’ minds.] Overall, this finding demonstrates that differences in trust between individuals and over time may not solely reflect variation in the substantive dimension of trust. Instead, they might be influenced by specification errors and differences in how respondents interpret the question due to inter-individual differences in frames of reference.    
Fourth, we also explored a hitherto neglected dimension -- the sentiment of associations. We found a relatively low proportion of respondents reporting negative associations, which remained consistent across measures (cf. Figure \@ref(fig:fig-sentiment)). Against our expectations, M3, the stranger question (without situations), does not seem to evoke more negative associations than the "most people" and "people first time" questions. While negative associations did influence trust scores negatively, the effect was not uniform across measures and models (cf. Figure \@ref(fig:coefficient-plot)). These findings offer encouraging insights into measurement, yet we call for further research to explore whether specific question formats trigger more emotional responses or negative memories.
Our study yields several key findings that not only allow us to draw valuable conclusions but also pave the way for future research directions.   
Firstly, among the trust questions we investigated, our various "stranger" questions (M3, and M4.1 to M4.4) demonstrated the highest level of construct validity, as evidenced by the lower share of respondents thinking of known individuals. However, from an empirical perspective, we may question how many trust situations actually take place among total strangers. For example, the four situations in our study are more likely to take place among individuals who have some knowledge about each other (e.g., acquaintances). Certainly, it can be challenging to pinpoint situations that entirely lack associations to known others, but we think that further theoretical work is necessary to classify trust measures according to whether they primarily pertain to strangers or also encompass acquaintances.^[ It may be beneficial to explore the semantic meaning of the term "stranger" and consider situations where individuals might perceive acquaintances as strangers for specific trust decisions, such as lending money. This highlights the situative nature of trust, where perceptions may vary depending on the context of the interaction [cf. @Hardin2002-mw, 9].]
Secondly, researchers should carefully consider various factors when selecting measures for their studies, aligning with their specific definition of generalized trust. Our findings indicate that M3 best captures generalized trust when defined as trust towards unknown others (cf. Figure \@ref(fig:fig-known-people)). However, for those interested in interpersonal comparability, situative measures like the Imaginary Stranger Trust Scale (IST) offer a viable alternative, since they explicitly define the concrete situation in which trust has to be placed and thus leave less room for different interpretations. Nonetheless, they demand additional questionnaire space due to longer item descriptions.^[ For more detailed considerations between shorter and longer versions of IST, we refer readers to @Robbins2022-yn.] Generally, future studies could make use of additional situative measures by using vignette designs. The resulting data could be analyzed by calculating the average trust across a set of situative trust measures, yielding a score of what we call cross-situational trust [@Bauer2018-ex; @Robbins2022-yn].^[ This approach could extract an individual-specific general personal component of trust while acknowledging trust to be inherently situational, mitigate the effects of non-valid associations in single items and provide a more robust assessment of trust across diverse situations. A high-truster would then be someone who has a high level of trust across a large set of situations that involve trust.] However, we would also like to emphasize that the use of traditional measures such as M1 and M2 may be justified if the main objective is comparability with previous studies using these measures or corresponding panel studies.
Thirdly, our study focused on a U.S. sample, expanding on prior evidence from the UK [@Sturgis2010-sa]. While we expect similar findings in other populations, we lack direct evidence to support this claim. The lack of interpersonal comparability within a "homogeneous" sample of U.S. citizens may even be amplified when comparing individuals from different cultures, countries, and languages. We therefore exercise caution in generalizing our conclusions to other samples.
Fourthly, the main aim of this study was to examine established measures as they have been used for decades. This implied that we used the original wordings, which are characterized by answer scales of different lengths (e.g., 4pt and 7pt). Although we assume scale length does not significantly affect our main variable of interest (i.e., shares of associations), a potential full-factorial design (7x2), where all seven items are measured with both scales, could explore any subtle differences in greater detail. Also, we used a particular set of emerging measures (i.e., IST [@Robbins2019-nr; @Robbins2021-yx]), and considering other emerging measures, such as the Risk Aversion question in the GSOEP and the UK Household Longitudinal Study,^[ "Are you generally a person who is fully prepared to take risks in trusting strangers or do you try to avoid taking such risks?"] could provide valuable insights.
Fifth, we employed a probing technique (see Experimental Design Section) that restated the trustee category originally presented (e.g., "In answering the previous question, who came to your mind when you were thinking about 'most people'?"). Repeating this category could be regarded as a form of priming potentially creating demand effects (cf. Fn 6). For future research, exploring various probing strategies and utilizing designs that provide respondents with as little information as possible, and thereby avoiding any priming, could be a valuable avenue to pursue.    
Finally,  an  open question emerges concerning whether frames of  reference are  systematically linked to  respondents’  demographic characteristics. Preliminary correlational evidence (see Online Appendix A.7) seems to show that this is not the case. This is encouraging and could mean that associations are predominantly random. However, to gain further clarity, future studies could extend the set of covariates considered and potentially employ a randomized design that attempts to induce associations of a particular kind to avoid post-hoc rationalization.

\newpage

# Author's Note

Data and code required to reproduce the findings presented in this study are available in a public repository on Harvard Dataverse (doi:10.7910/DVN/FJXH5G).

To access the data and code, please visit the following link: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FJXH5G.

For any inquiries or assistance related to accessing the materials, readers are encouraged to contact the corresponding author listed in this manuscript.


\newpage

# References

<div id = "refs"></div>

\newpage

# Online Appendix {.reset-counters}



## A.1 Summary statistics

```{r data-summary-stats-trust}
data_standardized_trust <- data %>% 
  select(ID_participant, variable, value) %>%
  pivot_wider(names_from = variable, values_from = value) %>%
  select("M1: Trust most people (std.)" = `Most people`, 
         "M2: Trust people first time (std.)" = `People first time`, 
         "M3: Trust stranger (std.)" = Stranger, 
         "M4.1: Trust stranger secret (std.)" = `Keeping a secret`, 
         "M4.1: Trust stranger loan (std.)" = `Repaying a loan`, 
         "M4.3: Trust stranger child (std.)" = `Watching a loved one`, 
         "M4.4: Trust stranger advice (std.)" = `Money advice`)
```

Below we provide summary statistics for our sample. Our main, long-format dataset has `r nrow(data)` rows because we repeatedly observe our `r length(unique(data$ID_participant))` respondents across 7 trust measures (`1,500*7`).                            
Table \@ref(tab:summary-stats-trust) provides summary statistics for our trust measures which have been standardized to range from 0 to 1. `Unique (#)` describes the number of unique values the variable assumes (including the missing category "NA"). `Missing (%)` describes the percentage of missing values on that variable.^[ The difference in missing values for M1 (n=`r sum(is.na(data_standardized_trust[,1]))`) and M2 (n=`r sum(is.na(data_standardized_trust[,2]))`), as well as M1 (n=`r sum(is.na(data_standardized_trust[,1]))`) and M3 (n=`r sum(is.na(data_standardized_trust[,3]))`) is statistically significant ($p<0.001$).] The corresponding means are also displayed in Figure \@ref(fig:dotplot-means-subsets) (cf. Full dataset).
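
The structure of the long-format dataset and the `Missing (%)` column can be illustrated with a minimal base-R sketch. The toy data below are made up (three respondents, two measures) and only reuse the column names `ID_participant`, `variable` and `value` from our data; `toy_long`, `first_slice` and `missing_pct` are hypothetical names.

```r
# Toy long-format data (made up): 3 respondents x 2 measures.
toy_long <- data.frame(
  ID_participant = rep(1:3, times = 2),
  variable       = rep(c("Most people", "Stranger"), each = 3),
  value          = c(0.5, 1, NA, 0, 1/3, 2/3)
)

# Number of rows = respondents x measures
n_respondents <- length(unique(toy_long$ID_participant))
n_measures    <- length(unique(toy_long$variable))

# One slice (one measure) contains every respondent exactly once.
first_slice <- toy_long[toy_long$variable == "Most people", ]

# Share of missing values per measure, in percent
missing_pct <- tapply(toy_long$value, toy_long$variable,
                      function(v) 100 * mean(is.na(v)))
```

In this toy example the dataset has `n_respondents * n_measures` rows, and a single slice is sufficient for variables that do not vary across measures.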

```{r summary-stats-trust}
datasummary_skim(data_standardized_trust, 
                 type = "numeric", 
                 table.attr = "style='width:100%;'",
                 fmt = 2, 
  title = 'Summary statistics across (standardized) trust scales',
  output = 'gt'
  ) %>%
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white",
    table.font.names = "Times New Roman"
  )
```

Table \@ref(tab:summary-stats-trust-num) and Table
\@ref(tab:summary-stats-trust-cat) present summary statistics for numeric and categorical variables (excluding trust measures), along with population estimates where applicable. For the socio-demographic variables, which remain constant across our various trust measures, we utilized the first slice of our long-format dataset, encompassing all 1,500 respondents, to generate these statistics.    

```{r summary-stats-trust-num}
data_summary_stats <- data %>% 
  filter(variable=="Most people") %>% # only take first 1500
  select(age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>% 
  rename("Age (factor)"=age_cat,
         "Sex (factor)"=sex,
         "Ethnicity (factor)"=ethnicity,
         "Socio-economic status (numeric)"=socioeconomic_status_num,
         "Income (numeric)"=income_num,
         "Education (numeric)"=education_num)

datasummary_skim(data_summary_stats, 
                 type = "numeric",  
                 table.attr = "style='width:100%;'",
                 fmt = 2, 
  title = 'Summary statistics: Numeric covariates',
  output = 'gt'
  ) %>%
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white",
    table.font.names = "Times New Roman"
  )
```

```{r}
t<-datasummary_skim(data_summary_stats, 
                 type = "categorical", 
                 table.attr = "style='width:100%;'",
                 fmt = 1, 
  title = 'Summary statistics: Categorical covariates (factor variables)',
  output = 'gt'
  ) %>%
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white",
    table.font.names = "Times New Roman"
  )

data_census <- read.csv("../data/population data - US 2015 Estimates_from prolific but with own calculations.csv", sep=";")

t[["_data"]][["N (U.S. Census)"]]<-data_census$N
t[["_data"]][["% (U.S. Census)"]]<-data_census$X.
```

```{r summary-stats-trust-cat}
gt(t[["_data"]], caption="Summary statistics: Categorical covariates") %>% 
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white",
    table.font.names = "Times New Roman"
  )
```


\newpage

## A.2 Question wording
Table \@ref(tab:tab-wording) outlines the wording of our different survey measures.

```{r tab-wording}
table <- data.frame("Measure"= c("M1: Most people", "M2: People first time", "M3: Stranger", "M1-M3: Probe",
                                 "M4. ..", "M4.1: Keep a secret", "M4.2: Repay a loan", "M4.3: Look after child", "M4.4: Money advice",
                                 "M4.1-M4.4: Probe",
                                 "Age (factor)", "Sex (factor)", "Ethnicity (factor)", "Socioeconomic status (numeric)", "Income (numeric) (in GBP)^[ Income data is provided by Prolific.ac, a UK-based company, hence income data is in GBP and not USD.]", "Education (numeric)"),
                    
  "Question wording" = c(
    
"Generally speaking, would you say that most people can be trusted, or that you can't be too careful in dealing with people?\n Please tell me on a score of 0 to 6, where 0 means you can't be too careful and 6 means that most people can be trusted.", "How much do you trust people you meet for the first time?", "Imagine meeting a total stranger for the first time. Please identify how much you would trust this stranger.", "In answering the previous question, who came to your mind when you were thinking about 'most people / people you meet for the first time / a total stranger you meet for the first time'? Please describe.", "Imagine meeting a person for the first time / total stranger for the first time. (first question). Please identify how much you would trust this person to...\n Please identify how much you would trust a person you meet for the first /a total stranger you meet for the first time  to... (all following)", "...keep a secret that is damaging to your reputation?", "...repay a loan of one thousand dollars?", "...look after a child, family member, or loved one while you are away?", "...provide advice about how best to manage your money?", "In answering the previous question, who came to your mind when you were thinking about a 'person you meet for the first time'/ 'total stranger you meet for the first time'? Please describe.",
"What is your current age in years?", "What sex were you assigned at birth, such as on an original birth certificate?", "What ethnic group do you belong to?", "Think of a ladder (see image) as representing where people stand in society. At the top of the ladder are the people who are best off—those who have the most money, most education and the best jobs. At the bottom are the people who are worst off—who have the least money, least education and the worst jobs or no job. The higher up you are on this ladder, the closer you are to people at the very top and the lower you are, the closer you are to the bottom. Where would you put yourself on the ladder? Choose the number whose position best represents where you would be on this ladder.", "What is your personal income per year (after tax) in GBP? If you need to convert from another currency you can find a converter [here]", "Which of these is the highest level of education you have completed?"),

  "Response scale & recoding"= c("**Original scale**: 0 - You can't be too careful;
1; 2; 3; 4; 5; 6 - Most people can be trusted; Don’t know; **Recoded scale**: Don't know = NA and values 0-6 standardized to 0-1.", 

"**Original scale**: Do not trust at all; Trust not very much: Trust somewhat; Trust completely; Don’t know; **Recoded scale**: Don't know = NA and values 1-4 standardized to 0-1.", 

"**Original scale**: Do not trust at all; Trust not very much: Trust somewhat; Trust completely; Don’t know; **Recoded scale**: Don't know = NA and values 1-4 standardized to 0-1.", 

"open textbox", 

"**Original scale**: Do not trust at all; Trust not very much: Trust somewhat; Trust completely; Don’t know; **Recoded scale**: Don't know = NA and values 1-4 standardized to 0-1.", "See above.", "See above.", "See above.", "See above.",
"open textbox",
"**Original scale**: Simple numeric entry; **Recoded scale**: Recoded to factor with four levels (1) 17-29, (2) 30-43, (3) 44-59 and (4) 59-93.", 
"**Original scale**: Two answers options 'Male' and 'Female'; **Recoded scale**: Recoded to factor with two levels (1) Female and (2) Male.", 
"**Original scale**: Five answer options 'White', 'Black', 'Asian', 'Mixed' and 'Other'; **Recoded scale**: Recoded to factor with corresponding levels. Reference category is 'Asian'.", 
"**Original scale**: Ten answer options; **Recoded scale**: Numeric with 10 values.", 
"**Original scale**: Answer options 1 - Less than £10,000; 2 - £10,000 - £19,999; 3 - £20,000 - £29,999; 4 - £30,000 - £39,999; 5 - £40,000 - £49,999; 6 - £50,000 - £59,999; 7 - £60,000 - £69,999; 8 - £70,000 - £79,999; 9 - £80,000 - £89,999; 10 - £90,000 - £99,999; 11 - £100,000 - £149,999; 12 - More than £150,000; Rather not say; **Recoded scale**: Numeric with 13 values, Don't know = NA.", "**Original scale**: Answer options 1 - No formal qualifications; 2 - High school diploma/A-levels; 3 - Secondary education (e.g. GED/GCSE), 4 - Technical/community college; 5 - Undergraduate degree (BA/BSc/other); 6 - Graduate degree (MA/MSc/MPhil/other); 7 - Doctorate degree (PhD/other); Don't know / not applicable; **Recoded scale**: Numeric with 7 values, Don't know = NA."))

names(table) <- gsub("\\.", " ", names(table))

kable(table, "html", escape = F, align=rep('l', 3),
      caption = "Question wording",
      table.attr='class="myTable"') %>%
   column_spec(column = 1, width = "1in") %>%
   column_spec(column = 2, width = "5in") %>%
   column_spec(column = 3, width = "5in") %>%
  row_spec(row = 0,
           bold = TRUE,
           align = "center")%>% kable_classic(full_width = F, html_font = "Times New Roman") %>% 
  kable_styling(font_size = 9)

```
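
The recoding of the trust scales described in the table can be sketched as a simple min-max rescaling. This is a minimal illustration, not the code used to prepare our data: the helper `rescale_01` and the example vectors are hypothetical, and we assume that "Don't know" answers have already been set to `NA`.

```r
# Hypothetical helper: min-max rescaling of an ordinal scale to the 0-1 range.
# "Don't know" answers are assumed to be coded as NA before rescaling.
rescale_01 <- function(x, min_value, max_value) {
  (x - min_value) / (max_value - min_value)
}

m1_raw <- c(0, 3, 6, NA)   # 7-point scale (0-6), NA = "Don't know"
m2_raw <- c(1, 2, 4, NA)   # 4-point scale (1-4), NA = "Don't know"

m1_std <- rescale_01(m1_raw, min_value = 0, max_value = 6)
m2_std <- rescale_01(m2_raw, min_value = 1, max_value = 4)
```

After rescaling, both scales share the endpoints 0 and 1, but the 4-point scale can only take the values 0, 1/3, 2/3 and 1, whereas the 7-point scale takes steps of 1/6.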


## A.3 Open-ended text answers

```{r data-word-count}
variables <- data %>% filter(probing_answer != "") %>%
  select(probing_answer) %>% 
  mutate(word_count=str_count(probing_answer, '\\s+')+1) %>% 
  mutate(nchar = nchar(probing_answer))

data_plot <- variables %>% 
  group_by(word_count) %>%
  summarize(n = n()) %>%
  arrange(word_count) %>% ungroup()
```

Figure \@ref(fig:fig-word-count) displays the distribution of answer lengths for our `r nrow(variables)` non-empty open-ended probing answers. On average, respondents used `r round(mean(variables$word_count), 2)` words (min = `r min(variables$word_count)`, max = `r max(variables$word_count)`, sd = `r round(sd(variables$word_count), 2)`) for each probe.
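
The word count is obtained as the number of whitespace runs in an answer plus one. The following base-R sketch, using made-up answers, illustrates this and shows one caveat: leading or trailing whitespace is counted as an additional separator. The token-based alternative is shown for comparison only and is not the approach used above.

```r
# Toy answers (made up); not taken from the survey data.
answers <- c("my family", "  people in my town ", "strangers")

# Separator-based count, analogous to counting "\\s+" matches and adding 1
sep_count <- lengths(regmatches(answers, gregexpr("\\s+", answers))) + 1

# Token-based count after trimming surrounding whitespace
token_count <- lengths(strsplit(trimws(answers), "\\s+"))
```

For answers without surrounding whitespace both counts agree; they differ only for the second toy answer, which is padded with spaces.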

```{r fig-word-count, fig.cap="Length of open-ended responses", fig.width=4, fig.height=1}
ggplot(data=data_plot, aes(x=word_count, y=n)) +
  geom_bar(stat="identity")  +
  xlab("Number of words per response") +
  ylab("Number of responses") +
  theme_minimal(base_size = 11) +
  theme(axis.text=element_text(size=5),
        axis.title=element_text(size=5),
        text = element_text(family = "Times New Roman"))
```

```{r eval=F}
ggplot(data=data_plot, aes(x=word_count, y=n)) +
  geom_bar(stat="identity")  +
  xlab("Number of words per response") +
  ylab("Number of responses") +
  theme_minimal(base_size = 16) +
  theme(axis.text=element_text(size=12),
        axis.title=element_text(size=12),
        text = element_text(family = "Times New Roman"))
```

Figure \@ref(fig:fig-top-words) displays the 15 most frequent words by probing question. Besides giving an overview of which words are commonly used, the side-by-side bar plot also shows which frequent words do not appear for all three measures. For instance, the term "family" appears frequently only among answers to the question about most people.

```{r data-top-words, include=FALSE}
# Create a corpus for each generalized measure
data_mp <- data %>%  subset(variable=="Most people")
corpus_most_people <- Corpus(VectorSource(data_mp$probing_answer))
data_pft <- data %>%  subset(variable=="People first time")
corpus_people <- Corpus(VectorSource(data_pft$probing_answer))
data_sft <- data %>%  subset(variable=="Stranger")
corpus_stranger <- Corpus(VectorSource(data_sft$probing_answer))
# Custom function to clean corpus
clean_corpus <- function(corpus) {
  corpus <- tm_map(corpus, removePunctuation, preserve_intra_word_dashes = TRUE)
  corpus <- tm_map(corpus, removeNumbers)
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  corpus <- tm_map(corpus, stripWhitespace)
  return(corpus)
}
# Apply the cleaning function to each corpus
corpus_most_people<-clean_corpus(corpus_most_people)
corpus_people<-clean_corpus(corpus_people)
corpus_stranger<-clean_corpus(corpus_stranger)
 
#generate two-column data frame
dtm <- DocumentTermMatrix(corpus_most_people)
matrix <- as.matrix(dtm) 
words <- sort(colSums(matrix),decreasing=TRUE) 
df <- data.frame(word = names(words),freq=words)
dtm2 <- DocumentTermMatrix(corpus_people) 
matrix2 <- as.matrix(dtm2) 
words2 <- sort(colSums(matrix2),decreasing=TRUE) 
df2 <- data.frame(word = names(words2),freq=words2)
dtm3 <- DocumentTermMatrix(corpus_stranger) 
matrix3 <- as.matrix(dtm3) 
words3 <- sort(colSums(matrix3),decreasing=TRUE) 
df3 <- data.frame(word = names(words3),freq=words3)

df_word <- bind_rows(df %>% select(word, freq) %>% mutate(question = "Most People (M1)"),
                     df2 %>% select(word, freq) %>% mutate(question = "People First Time (M2)"),
                     df3 %>% select(word, freq) %>% mutate(question = "Stranger (M3)")) %>%
  remove_rownames() %>% 
  arrange(question, desc(freq)) %>%
  as_tibble()

# Top words
df_word <- df_word %>% 
  group_by(question) %>% 
  arrange(desc(freq)) %>% 
  slice_max(freq, n = 15) %>%
  mutate(question = factor(question, levels= c("Most People (M1)",
                                               "People First Time (M2)",
                                               "Stranger (M3)")))
```

```{r fig-top-words, fig.cap="Most frequent words in open-ended answers to generalized trust questions", fig.width=6, fig.height=3}
ggplot(df_word, aes(reorder(word,freq), freq)) +
  geom_bar(stat = "identity") + coord_flip() +
  xlab("Terms") + ylab("Frequency") +
  facet_wrap(~question) +
  theme_minimal(base_size = 11) +
  theme(axis.text=element_text(size=6),
        axis.title=element_text(size=6),
        text = element_text(family = "Times New Roman"))
```
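
The logic behind the term frequencies can be sketched in a few lines of base R. This is a simplified illustration on made-up answers: the stopword list `stop_words` is a hypothetical miniature stand-in for the English stopword list used above, and `top_terms` is a hypothetical helper rather than part of our analysis code.

```r
# Toy answers (made up) for two hypothetical questions.
toy <- data.frame(
  question = c("M1", "M1", "M1", "M3", "M3"),
  answer   = c("my family and friends", "family", "people in general",
               "a random stranger", "stranger on the street"),
  stringsAsFactors = FALSE
)

# Hypothetical mini stopword list (illustration only)
stop_words <- c("my", "and", "in", "a", "on", "the")

# Lowercase, split on non-letters, drop stopwords, count, keep the top n terms
top_terms <- function(text, n = 2) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  tokens <- tokens[nzchar(tokens) & !tokens %in% stop_words]
  head(sort(table(tokens), decreasing = TRUE), n)
}

top_by_question <- lapply(split(toy$answer, toy$question), top_terms)
```

Computing the ranking separately per question, as in the figure, makes it visible when a term is frequent for one measure but absent from the top terms of another.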



\newpage

## A.4 Manual classification: Coding schemes

Tables \@ref(tab:tab-codingscheme-content) and \@ref(tab:tab-codingscheme-sentiment) show descriptions and examples for the different codes. To classify associations referring to known/unknown others, documents of Code **8** were subsumed under Code **0** and documents of Code **9** under Code **1**. For the sentiment analysis, Code **-1** and **0** were combined into one category (**0**). Also, we added Code **8** to this neutral/positive category (**0**). Code 9 was applicable to only very few documents (3 of 1,500) and thus was excluded. These manipulations allow us to examine our hypotheses using a dichotomous classifier (negative vs. neutral/positive sentiment) while at the same time reducing complexity for the classifier.
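
The collapsing of the four content codes into a binary label can be sketched as follows. This is a minimal illustration under the mapping described above (Code 8 to 0, Code 9 to 1); the helper `recode_known` and the example vector `code_manual` are hypothetical and not part of our analysis code.

```r
# Manual codes of the content coding scheme: 1 = known, 0 = unknown,
# 8 = not applicable, 9 = mixed. The vector below is made up for illustration.
code_manual <- c(1, 0, 8, 9, 0, 1)

# Collapse to a binary label: 8 -> 0 and 9 -> 1, as described above.
recode_known <- function(code) {
  stopifnot(all(code %in% c(0, 1, 8, 9)))   # guard against unexpected codes
  ifelse(code %in% c(1, 9), 1L, 0L)
}

code_binary <- recode_known(code_manual)
```

A binary label of this kind is what a dichotomous classifier is trained on, at the cost of no longer distinguishing mixed from purely known associations.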

```{r tab-codingscheme-content}
table <- data.frame(
  "Code" = c("1", "0", "8", "9"),
  "Description" = c("<ul><li><b>known others:</b> includes all mentions of persons the respondent personally knows. This also includes persons the respondent has only met once before (i.e., who are no longer strangers).</li></ul>",
                    "<ul><li><b>unknown others:</b> includes all mentions of persons the respondent does not know. This also includes descriptions of groups in which the respondent might know some, but certainly not all, of the persons.</li></ul>",
                          "<ul><li><b>not applicable:</b> includes all answers that do not refer to Code 1 or Code 0, including nonsense or irrelevant answers (indicators of low response quality).</li></ul>",
                          "<ul><li><b>mixed:</b> includes all statements that mention both known <u>and</u> unknown persons, since respondents did not always describe only one of the two.</li></ul>"),
  "Examples" = c("<ul><li>Everyone I know</li><li>people I interacted with</li><li>the sum total of all people you know and meet</li><li>A person I met a week ago</li></ul>",
        
        "<ul><li>No one in particular</li><li>just people as a whole</li><li>a random bunch of people</li><li>people that are young like myself</li><li>people in my town</li></ul>",
        
        
        "<ul><li>no one/nothing</li><li>everyone / everybody / anyone</li><li>myself</li><li>don’t know</li></ul>",
        
        
        "<ul><li>those I meet in my everyday activities</li><li>People I interact with on a daily basis; people at work, people at the grocery store, etc.</li><li>People I may run into everyday.</li></ul>"
                 
                 )
)


kable(table, "html", escape = F, align=rep('l', 3),
      table.attr='class="myTable"', 
      caption = 'Coding scheme for associations (known-unknown others)') %>% kable_styling(font_size = 11) %>%
   column_spec(column = 1, width = "0.5in") %>%
   column_spec(column = 2, width = "3in") %>%
   column_spec(column = 3, width = "3in") %>%
  row_spec(row = 0,
           bold = TRUE)%>% kable_classic(full_width = F, html_font = "Times New Roman")
```

```{r tab-codingscheme-sentiment}
table <- data.frame(
  "Code" = c("1", "-1", "0", "8", "9"),
  "Description" = c("<ul><li><b>negative:</b> includes all documents that make use of <u>explicit</u> negative sentiment.</li></ul>",
                          "<ul><li><b>positive:</b> includes all documents that make use of <u>explicit</u> positive sentiment.</li></ul>",
                          "<ul><li><b>neutral:</b> includes all documents that make use of <u>explicit</u> neutral sentiment. Importantly, there has to be enough text to assess that some kind of sentiment is given.
</li></ul>",
                          "<ul><li><b>not applicable:</b> includes all documents in which no sentiment is mentioned or in which it is unclear which sentiment is being associated. This includes documents that make only implicit use of sentiment (i.e., some kind of interpretation is needed), documents that are too short to allow an assessment, and all documents to which none of the previous codes apply, including nonsense answers.</li></ul>",
                          "<ul><li><b>mixed:</b> all documents that make <u>explicit</u> use of negative <u>and</u> positive sentiment.</li></ul>"),
  "Examples" = c("<ul><li>Chloe, met her at the gym, asked her to help watch my stuffs while i use the restroom. When i came back, she was gone.
</li><li>Someone doing something behind my back that will jeopardize my wellbeing, my place of residence, or blame me or start stories about me that aren't true. This has happened a few times to me before.</li></ul>",
                 
        "<ul><li>I guess because I live in a city were the population is more dense, the chance of dealing with a wider spectrum of people increases. I can see most encounters would be of a kind person with good intentions, so just about anyone would and could be kind.</li><li>Generally someone that I might have contact with for the first time, and might not ever have contact with again. Someone stopping to give help on the side of the road, for example.</li><li>I just thought of general strangers and how I approach them. In general as long as I don't need to trust them with anything in particular I start with a little trust</li></ul>",
        
        
        "<ul><li>No particular person came to mind. For me when first meeting someone I have to see how the conversation flows. Trust is earned.</li><li>I wouldn’t have any reason to trust them completely but would give them the benefit of doubt</li><li>It depends on what you are trusting the individual for. In general you would trust that the stranger means no harm to you.
</li></ul>",
        
        
        "<ul><li>myself</li><li>don't know</li><li>friends/family/coworker</li><li>OMG</li></ul>",
        
        
        "<ul><li>Most people can't be trusted because people have different thoughts to one another. Some people wants the other people to succeed while some people want the other people to fail or harm them</li><li>By most people, I was thinking about the extremes of people between those who hold themselves to strict high, moral standards regularly, and those who live on impulse with aggression issues and mental instability.</li></ul>"
                 
                 )
)


kable(table, "html", escape = F, align=rep('l', 3),
      table.attr='class="myTable"', 
      caption = 'Coding scheme for associations (sentiment)') %>% kable_styling(font_size = 11) %>%
   column_spec(column = 1, width = "0.5in") %>%
   column_spec(column = 2, width = "3in") %>%
   column_spec(column = 3, width = "3in") %>%
  row_spec(row = 0,
           bold = TRUE)%>% kable_classic(full_width = F, html_font = "Times New Roman")

```


\newpage




## A.5 Automated classification: Evaluation

Below, we assess our automatic classification approach by comparing its results to different subsets of the data: manually coded data only and a data subset that eliminates question order effects.

### A.5.1 Manual vs. automated classification

Figure \@ref(fig:fig-manual-ml-robust) displays distributions of the codes by coding procedure, i.e., manual (n=1,000/1,500) and automated classification (n=6,500/6,000).

```{r fig-manual-ml-robust, fig.cap="Shares of codes by coding procedure (manual vs. ML)"}
p1<-ggplot(data2 %>% filter(!is.na(code_known_unknown)),
       aes(x= code_known_unknown, group=coding_procedure_known_unknown)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)),  
              stat= "count",
              family = "Times New Roman",
              size = 3) +
    labs(y = "% of respondents", 
         fill="Associations \n(known others)")+
    facet_wrap(vars(coding_procedure_known_unknown)) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data$code_known_unknown)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data$code_known_unknown)) + 
  theme(axis.text.x = element_text(angle = 40, hjust = 0.5),
                plot.caption = element_text(hjust = 0))+
  xlab("") +
  theme_minimal(base_size = 11) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1), legend.position="bottom",
                text = element_text(family = "Times New Roman"))

p2<-ggplot(data2 %>% filter(!is.na(code_sentiment_dichotomous)),
       aes(x= code_sentiment_dichotomous, group=coding_procedure_sentiment_dichotomous)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)), 
              stat= "count",
              family = "Times New Roman",
              size = 3) +
    labs(y = "", 
         fill="Associations \n(sentiment)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0).") +
    facet_wrap(vars(coding_procedure_sentiment_dichotomous)) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data$code_sentiment_dichotomous)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data$code_sentiment_dichotomous)) + 
  theme(axis.text.x = element_text(angle = 40, hjust = 0.5),
                plot.caption = element_text(hjust = 0))+
  xlab("") +
  theme_minimal(base_size = 11) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1), legend.position="bottom",
        text = element_text(family = "Times New Roman"))

cowplot::plot_grid(p1, p2, ncol = 2, align = "h", axis="l")
```

Generally, we observe an imbalance: for both types of association, the code of particular theoretical interest (known people and negative sentiment, respectively) was assigned less often than the reference code (unknown people and neutral/positive sentiment, respectively). In the case of the known--unknown classification, the BERT classifier yields an even smaller share of the known-others code than manual coding, which may reflect classification error.
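The error bars in the figures of this appendix are normal-approximation intervals of the form $p \pm z\sqrt{p(1-p)/n}$, with the lower bound cut off at 0. The following minimal sketch shows this computation in isolation; the helper `wald_ci()` and the numbers passed to it are hypothetical, and which count serves as the denominator `n` is left as an argument here.

```r
# Hypothetical helper: normal-approximation (Wald) interval for a share
wald_ci <- function(p, n, level = 0.95) {
  z  <- qnorm(1 - (1 - level) / 2)   # 1.96 for a 95% interval
  se <- sqrt(p * (1 - p) / n)        # standard error of the share
  c(lower = max(0, p - z * se),      # lower cutoff at 0
    upper = p + z * se)
}

# Illustrative numbers only: a share of 10% among 1,000 coded answers
wald_ci(p = 0.10, n = 1000)

# With a small share and few answers, the lower bound is cut off at 0
wald_ci(p = 0.01, n = 20)
```

For shares close to 0 or 1 and small `n`, the normal approximation is rough, which is why the cutoff matters in practice.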


### A.5.2 Subset analysis: Manual classification

To additionally examine the robustness of our main findings, Figure \@ref(fig:fig-manual-codes-robust) shows findings for a dataset that includes only our manually coded ("gold standard") data (n=1,000/1,500). Note that for manually coding the known--unknown dimension we drew a sample of answers to M1, while for manually coding the sentiment dimension we drew a sample of answers to M1, M2 and M3 (see Methods Section). Again, both panels show that the codes of substantive interest (i.e., known people and negative experiences) appear more often in our manually coded data, while the pattern across measures remains the same as in the main paper.

```{r fig-manual-codes-robust, fig.cap="Distribution of associations with known people across trust measures", fig.align="center"}
p1 <- ggplot(data2 %>% filter(!is.na(code_known_unknown) & coding_procedure_known_unknown=="manual"), 
       aes(x= code_known_unknown, group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)),  
              stat= "count",
              family = "Times New Roman",
              size = 3) +
    labs(y = "% of respondents", 
         fill="Associations \n(known others)") +
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data2$code_known_unknown)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data2$code_known_unknown)) + 
  theme(axis.text.x = element_text(angle = 40, hjust = 0.5),
                plot.caption = element_text(hjust = 0))+
  xlab("") +
  theme_minimal(base_size = 11) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1), legend.position="bottom",
        text = element_text(family = "Times New Roman"))


p2 <- ggplot(data2 %>% filter(!is.na(code_sentiment_dichotomous) & coding_procedure_sentiment_dichotomous=="manual"), 
       aes(x= code_sentiment_dichotomous, group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(
                  #ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
  ymin = ifelse(..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..) < 0, 0, ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)),
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
  
  
    geom_text(aes(label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop.., 
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)),
              stat= "count",
              family = "Times New Roman",
              size = 3) +
    labs(y = "", 
         fill="Associations \n(sentiment)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0).") +
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data2$code_sentiment_dichotomous)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data2$code_sentiment_dichotomous)) + 
  theme(axis.text.x = element_text(angle = 40, hjust = 0.5),
                plot.caption = element_text(hjust = 0))+
  xlab("") +
  theme_minimal(base_size = 11) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1), legend.position="bottom",
        text = element_text(family = "Times New Roman"))

cowplot::plot_grid(p1, p2, ncol = 2, align = "h", axis="l")+expand_limits(y = 900000)

```

```{r regression-robust, include=F}
m1_robust <- data %>% filter(coding_procedure_known_unknown=="manual") %>% 
  lm(value ~ code_known_unknown, data = .)
m1_robust_covariates <- data %>% filter(coding_procedure_known_unknown=="manual") %>% 
  lm(value ~ code_known_unknown + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)

m2_robust <- data %>% filter(coding_procedure_sentiment_dichotomous=="manual") %>% 
  lm(value ~ code_sentiment_dichotomous, data = .)
m2_robust_covariates <- data %>% filter(coding_procedure_sentiment_dichotomous=="manual") %>% 
  lm(value ~ code_sentiment_dichotomous + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)


models <- list("M1" = m1_robust, 
               "M1.1" = m1_robust_covariates, 
               "M2" = m2_robust, 
               "M2.1" = m2_robust_covariates)

regression_notes <- "Notes: Stars indicate significance levels +=.1, *=.05, **=.01, ***=0.001."
regression_varnames <- c("age_cat30-43" = "Age (30-43)", 
                             "age_cat44-59" = "Age (44-59)",
                             "age_cat59-93" = "Age (59-93)",
                             "sexMale" = "Sex (Male)",
                             "ethnicityBlack" = "Ethnicity (Black)",
                             "ethnicityMixed" = "Ethnicity (Mixed)",
                             "ethnicityOther" = "Ethnicity (Other)",
                             "ethnicityWhite" = "Ethnicity (White)",
                             "socioeconomic_status_num" = "Socioeconomic status",
                             "income_num" = "Income",
                             "education_num" = "Education",
                             "code_known_unknownYes" = "Association (Known)",
                             "code_sentiment_dichotomousnegative" = "Sentiment (negative)"
                             )

modelsummary(models,
             title = 'Linear regression of trust scores (Y) on associations (Xs)',
             output = 'gt',
             notes = regression_notes,
             stars = TRUE,
             coef_rename = regression_varnames) %>%
    tab_spanner(label = 'Dependent variable: Trust scores (for all measures)', columns = 2:5) %>%
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white"
  )
```

Overall, we can assume that in our main analysis we underestimated the prevalence^[ The possibility that we manually coded a subset of "special" documents (e.g., with a relatively high share of negative experiences and known others) by chance is unlikely given the random sampling.] and effects^[ Regression analyses (results available upon request) using the manually coded data only (n=1,000/1,500) yield findings similar to those based on the overall data (see Figure \@ref(fig:coefficient-plot); Tables \@ref(tab:tab-reg-1) to \@ref(tab:tab-reg-7)). First, mentioning known others statistically significantly increases reported trust scores ($\beta$=`r m1_robust_covariates[["coefficients"]][["code_known_unknownYes"]]`, $p$<0.001; model includes covariates). Second, sentiment in the form of negative experiences statistically significantly decreases reported trust scores ($\beta$=`r m2_robust_covariates[["coefficients"]][["code_sentiment_dichotomousnegative"]]`, $p$<0.001; model includes covariates).] of associations with known people and negative experiences, which only strengthens our overall findings.


### A.5.3 Subset analysis: Subset 2

Figure \@ref(fig:fig-known-people-robust) displays the share of respondents who described associations with either known or unknown others across our seven measures, but only for Subset 2 (n=1,500), in which, for each question, we use data only from those respondents who received the respective question in the very first position of the questionnaire (details are provided in the Methods Section). The results strongly support the findings reported in the main paper.

```{r fig-known-people-robust, fig.cap="Distribution of associations with known people across trust measures"}
data_subset_2 <-
  data2 %>% subset((only_block1 == TRUE &
                     only_first_gentrust_question == TRUE) |
                    (only_block2 == TRUE & only_first_sittrust_question == TRUE)
  )

ggplot(data_subset_2 %>% 
         filter(!is.na(code_known_unknown)), 
       aes(x= code_known_unknown, group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 3, -2.5)), 
              stat= "count",
              family = "Times New Roman",
              size = 3) + # 4 for presentations
    labs(y = "% of respondents", 
         fill="Associations \n(known others)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0).") +
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data$code_known_unknown)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data$code_known_unknown)) + 
  theme_minimal() +
  #theme_minimal(base_size = 12) + #presentations
  theme(axis.title.x=element_blank(),
        axis.text.x = element_text(angle = 30, hjust = 1),
        legend.position="bottom",
        #legend.position="right", #presentations
        #legend.title = element_text(size=12), #presentations
        #legend.text = element_text(size=12), #presentations
        #strip.text = element_text(size=12), #presentations
        text = element_text(family = "Times New Roman"),
        plot.caption = element_text(hjust = 0, size=9))+
  xlab("")

```


## A.6 Automated classification: Alternatives

Below, we report alternative approaches that we used to classify the content (known--unknown) and sentiment of the open-ended answers.

### A.6.1 Classification of associations (known--unknown others) with regular expressions

```{r target-regex}
target <- paste("friend",
"family",
"coworker",
"co-worker",
"neighbor",
"relative",
"boyfriend",
"girlfriend",
"husband",
"wife",
"father",
"mother",
"sister",
"brother", sep = "|")
```

To identify responses that mentioned known others, we additionally used regular expressions to detect open-ended responses that contained any of the following terms: *`r stringr::str_replace_all(paste(target), "\\|", ", ")`*. Figure \@ref(fig:fig-unknown-known-regex) shows findings for the known--unknown categories across our seven trust measures. The emerging pattern mimics the one from our main analysis (cf. Figure \@ref(fig:fig-known-people)).
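How such term matching behaves can be illustrated with a small, self-contained sketch. It uses base R's `grepl()` rather than the paper's own pipeline; the object names `target_demo` and `answers` and the example answers are hypothetical and not taken from the survey data.

```r
# Subset of the target terms, joined into one alternation pattern
target_demo <- paste(c("friend", "family", "coworker", "neighbor"),
                     collapse = "|")

# Hypothetical answers, not taken from the survey data
answers <- c("My best friend from school",
             "Nobody in particular, just strangers",
             "A friendly cashier I had never seen before")

# TRUE where at least one term occurs anywhere in the answer
mentions_known <- grepl(target_demo, answers)
mentions_known
```

Because the pattern matches substrings, the third answer is flagged as well ("friendly" contains "friend"). This illustrates why a dictionary approach of this kind is best read as a rough robustness check rather than a precise classifier.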

```{r unknown-known-regex, include=FALSE}
data_reg_ex <- data2 %>% 
  mutate(known_people = str_detect(probing_answer, target)) %>% 
  mutate(known_people = factor(as.numeric(known_people),
                               levels = c(0,1), labels = c("No", "Yes")))
```

```{r fig-unknown-known-regex, fig.cap="Shares of associations with known people across social trust measures"}
ggplot(data_reg_ex %>% 
         filter(!is.na(known_people)), 
       aes(x= known_people, group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)), 
              stat= "count",
              family = "Times New Roman",
              size = 3) + # 5
    labs(y = "% of respondents", 
         fill="Associations \n(known others)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0).") +
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data_reg_ex$known_people)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data_reg_ex$known_people)) + 
  theme(axis.text.x = element_text(angle = 40, hjust = 0.5),
                plot.caption = element_text(hjust = 0))+
  theme_minimal(base_size = 11) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1), legend.position="bottom",
        text = element_text(family = "Times New Roman"))+
  xlab("")
```

\newpage

### A.6.2 Classification of associations (known--unknown others) and sentiment using random forests

#### Random forest and document-term matrix

Furthermore, we trained two random forest classifiers. To make the text data processable by this machine learning algorithm, we first transformed it into numerical data via *tokenization*, where each unigram (i.e., each unique term used in the open-ended text answers) is one-hot encoded into a separate variable indicating whether or not the respective document (i.e., a certain text answer) contains the unigram of interest. These binary indicators are stored in a document-term matrix (DTM).^[ In text mining, a DTM is a specific type of matrix used to represent the frequencies of terms in documents. Typically, a DTM has $m$ rows and $n$ columns, where $m$ represents the total number of documents and $n$ the total number of terms. Each entry $a_{ij}$ contains the frequency with which term $j$ occurs in document $i$ [@Anandarajan2019-lb].] A glimpse into this representation of text data is given in Table \@ref(tab:tab-dtm).
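As a minimal sketch of this representation, the following base R code builds a binary document-term matrix from three toy documents. The documents and object names are hypothetical, and the sketch omits pre-processing steps such as stemming and stopword removal.

```r
# Hypothetical toy documents (not survey answers)
docs <- c(d1 = "people i know",
          d2 = "people in general",
          d3 = "i dont know")

# Tokenize on whitespace; each unique token becomes one column
tokens <- strsplit(docs, " ", fixed = TRUE)
vocab  <- sort(unique(unlist(tokens)))

# Binary DTM: rows are documents, columns are terms, 1 = term occurs
dtm <- t(sapply(tokens, function(tok) as.integer(vocab %in% tok)))
colnames(dtm) <- vocab
dtm
```

Each row of `dtm` is the feature vector that a classifier would receive for one document.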

```{r tab-dtm}
# select docs in the matrix that correspond to probing answers shown in the table before

corpus_example <- Corpus(VectorSource(table_exemplary_data$`Probing Answer`[1:3])) %>% 
  tm_map(removePunctuation, preserve_intra_word_dashes = TRUE) %>% 
  tm_map(removeNumbers) %>% 
  tm_map(content_transformer(tolower)) %>% 
  tm_map(removeWords, stopwords("english")) %>% 
  tm_map(stripWhitespace) %>% 
  tm_map(stemDocument)

dtm_example <- DocumentTermMatrix(corpus_example) %>%
  as.matrix({.}) %>% 
  as.data.frame({.})

colnames(dtm_example) = make.names(colnames(dtm_example)) #important

dtm_example <- dtm_example[,str_detect(colnames(dtm_example), "dont|know|littl|think|tourist")] %>% as.data.frame()

dtm_example <- bind_cols(
  table_exemplary_data %>% select(`Probing Answer`, `Associations (known-unknown)`, `Associations (sentiment)`) %>% ungroup() %>% slice(1:3),
  dtm_example
) %>% 
  mutate(across(where(is.numeric), as.character))

# add ".."
dtm_example <- dtm_example %>% bind_rows(set_names(rep("..", ncol(.)), colnames(.))) 

x <- data.frame(".." = rep("..", nrow(dtm_example)))

dtm_example <- bind_cols(dtm_example, x)

row.names(dtm_example)[4] <- ".."


# table
kable(dtm_example, format = "html", 
      align=rep('l', 3),
      table.attr='class="myTable"', 
      caption = 'Illustration of exemplary document-term matrix') %>%
  row_spec(row = 0,
           bold = TRUE) %>% 
  #row_spec(c(1,5,6), background = "#f03b2020") %>%
  #row_spec(c(2,3,4), background = "#43a2ca20") %>%
  kable_classic(full_width = F, 
                   html_font = "Times New Roman")

```

We pursued several common steps in pre-processing text data including stemming, transformation to lowercase, removal of punctuation, numbers and common stopwords [e.g., @Kathuria2021-ul]. Also, before we started training the classifier, we removed rare terms and only kept terms that appear in at least 0.5% of the documents. Random forests are commonly used for classifying text because they are algorithmically simple and at the same time provide high levels of performance even for multidimensional data [e.g., @Xu2012-ls; @Li-xin2006-je]. Briefly, the intuition of a random forest classifier is that a large number of simple decision trees (here 500) are fitted to the data. This is achieved through bootstrapping, where new training datasets are created by random sampling from the original data with replacement. Each decision tree is grown using random feature selection.^[ In one of its most popular variants [@Breiman2001-di], the single trees in the forest are constructed by randomly selecting a subspace of features (e.g., 2) at each node of a tree to grow further branches. For clarification, features in the case of text data are terms (see Table \@ref(tab:tab-dtm)).] Importantly, sampling with replacement (i.e., bootstrapping) ensures that approximately one-third of the documents will be out-of-bag (OOB) data [@Breiman2001-di, 11]. This OOB data serves as a built-in validation set, eliminating the need for additional splitting of the data into test and training sets.      
The task of classifying new data is done by bagging methods. More explicitly, each new datapoint $d$ (i.e., document) is passed down each tree following the logic of a simple decision tree. Results from doing this for all trees are aggregated and $d$ is assigned its prediction via majority vote.
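The out-of-bag share can be checked with a short simulation. When $n$ documents are sampled with replacement $n$ times, a given document is left out with probability $(1 - 1/n)^n$, which approaches $e^{-1} \approx 0.37$ for large $n$. The sketch below assumes a hypothetical corpus of 1,000 documents; only the number of trees (500) is taken from the text.

```r
set.seed(1)        # arbitrary seed for reproducibility
n_docs  <- 1000    # hypothetical number of documents
n_trees <- 500     # number of trees mentioned in the text

# For each tree, draw a bootstrap sample and record the share of
# documents that were never drawn (the out-of-bag share)
oob_share <- replicate(n_trees, {
  in_bag <- sample.int(n_docs, size = n_docs, replace = TRUE)
  1 - length(unique(in_bag)) / n_docs
})

mean(oob_share)            # simulated average OOB share
(1 - 1 / n_docs)^n_docs    # expected share, close to exp(-1)
```

Because each tree has its own out-of-bag documents, every document can be predicted by the trees that did not see it, which is what makes the OOB error usable as a validation measure.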

Using the above representation of data, we trained two classifiers. For evaluation, the OOB error rate (averaged over all bootstrapped datasets) provides an unbiased measure of accuracy [@Breiman2001-di, 11]. Classifying the known--unknown dimension achieves an overall OOB accuracy rate of `r accuracy_content`. The classifier for sentiment achieves an overall OOB accuracy rate of `r accuracy_sentiment_dichotomous`.

Table \@ref(tab:tab-exemplary-data-rf) shows a glimpse into exemplary documents that were classified with the Random Forest classifiers (cf. Table \@ref(tab:tab-exemplary-data) in the main paper).

```{r tab-exemplary-data-rf}
table_exemplary_data <- data %>% 
  filter(probing_answer!="") %>%
  mutate(ID_participant_long = as.numeric(factor(ID_participant_long))) %>%
  filter(!is.na(value)) %>%
  rename(measure = variable) %>%
  select(ID_participant_long, 
         measure,
         value,
         probing_answer,
         code_known_unknown_with_rf,
         code_sentiment_dichotomous_with_rf) %>%
  group_by(ID_participant_long) %>% 
  arrange(ID_participant_long, measure) %>% 
  rename(Person_ID = ID_participant_long) %>%
  mutate(value = round(value,2))%>%
  mutate(across(everything(), as.character)) %>% 
  mutate(Person_ID = as.character(Person_ID))  %>%
  mutate(measure= factor(measure, levels= c("Most people",
                                               "People first time",
                                               "Stranger",
                                               "Keeping a secret",
                                               "Repaying a loan",
                                               "Watching a loved one",
                                               "Money advice")))  %>%
  mutate(code_known_unknown_with_rf = recode(code_known_unknown_with_rf,
                                     "Yes" = "1 (Yes)",
                                     "No" = "0 (No)"),
         code_sentiment_dichotomous_with_rf = recode(code_sentiment_dichotomous_with_rf,
                                     "negative" = "1 (negative)",
                                     "neutral/positive" = "0 (neutral/positive)")) %>% 
  rename("Associations (sentiment)" = code_sentiment_dichotomous_with_rf,
         "Associations (known-unknown others)" = code_known_unknown_with_rf,         
         "Probe" = probing_answer,
         ID = Person_ID,
         "Trust   " = value,
         Measure = measure)


# Select observations
table_exemplary_data <- table_exemplary_data %>% arrange(Measure) %>% 
  filter(`Probe`!="") %>% 
  #filter(ID %in% c(1756, 7304, 1365, 7214, 1, 123, 3139, 2980, 1289, 1487))%>% 
  filter(ID %in% c(123,3100,7095,7181,1348,2941,1275,1466,4238,1))%>% #use other IDs for anonymized data bc order in the dataset changed due to probing_answer!="" filter
  bind_rows(set_names(rep("...", ncol(.)), colnames(.)))

# Colors
color.me.stranger <- which(table_exemplary_data$Measure=="Stranger")
color.me.peoplefirsttime <- which(table_exemplary_data$Measure=="People first time")
color.me.mostpeople <- which(table_exemplary_data$Measure=="Most people")
color.me.secret <- which(table_exemplary_data$Measure=="Keeping a secret")
color.me.loan <- which(table_exemplary_data$Measure=="Repaying a loan")
color.me.watchingloved <- which(table_exemplary_data$Measure=="Watching a loved one")
color.me.money <- which(table_exemplary_data$Measure=="Money advice")

# Table
kable(table_exemplary_data, format = "html", align=rep('l', 5),
      table.attr='class="myTable"', 
      caption = 'Illustration of exemplary data') %>%
  row_spec(row = 0,
           bold = TRUE) %>%
  # row_spec(color.me.stranger, background = "#66c2a515") %>%
  # row_spec(color.me.peoplefirsttime, background = "#fc8d6215") %>%
  # row_spec(color.me.mostpeople, background = "#a6d85415")  %>%
  # row_spec(color.me.secret, background = "#f9f2fa") %>%
  # row_spec(color.me.loan, background = "#edfffc") %>%
  # row_spec(color.me.watchingloved, background = "#feffe7") %>% 
  # row_spec(color.me.money, background = "#f2fdec") %>%
  row_spec(color.me.stranger, background = "lightgrey") %>%
  row_spec(color.me.peoplefirsttime, background = "white") %>%
  row_spec(color.me.mostpeople, background = "lightgrey")  %>%
  row_spec(color.me.secret, background = "white") %>%
  row_spec(color.me.loan, background = "lightgrey") %>%
  row_spec(color.me.watchingloved, background = "white") %>%
  row_spec(color.me.money, background = "lightgrey") %>% 
  kable_classic(html_font = "Times New Roman") %>%
  kable_styling(font_size = 10) %>% #16
   column_spec(column = 4, width = "3in")%>%
   column_spec(column = 5, width = "1in")%>%
   column_spec(column = 6, width = "2in")
```

Figure \@ref(fig:fig-known-people-rf) shows the share of associations with known and unknown others for each of our seven measures. The emerging pattern mirrors the one from our main analysis (cf. Figure \@ref(fig:fig-known-people)).

```{r fig-known-people-rf, fig.cap="Distribution of associations with known people across trust measures"}
# Plot the share of known vs. unknown others per measure (rows with missing codes are dropped)
ggplot(data2 %>% 
         filter(!is.na(code_known_unknown_with_rf)), 
       aes(x= code_known_unknown_with_rf, group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
  # Clip the lower bound at 0, matching the figure caption
geom_errorbar(aes(ymin = pmax(0, ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
    geom_text(aes( label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop..,
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -1.5)), 
              stat= "count",
              family = "Times New Roman",
              size = 3) + # 4 for presentations
    labs(y = "% of respondents", 
         fill="Associations \n(known others)",
         caption = "Note: Error bars represent 95% confidence intervals (lower cutoff at 0). Data is the full dataset irrespective of the question or\nblock randomization (details are provided in the Methods Section).") +
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data2$code_known_unknown_with_rf)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data2$code_known_unknown_with_rf)) + 
  theme_minimal() +
  #theme_minimal(base_size = 12) + #presentations
  theme(axis.title.x=element_blank(),
        axis.text.x = element_text(angle = 30, hjust = 1),
        legend.position="bottom",
        #legend.position="right", #presentations
        #legend.title = element_text(size=12), #presentations
        #legend.text = element_text(size=12), #presentations
        #strip.text = element_text(size=12), #presentations
        text = element_text(family = "Times New Roman"),
        plot.caption = element_text(hjust = 0, size=9))+
  xlab("")
```
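
The error bars described in the note to Figure \@ref(fig:fig-known-people-rf) follow the normal-approximation (Wald) form for a proportion, $\hat{p} \pm z_{0.975}\sqrt{\hat{p}(1-\hat{p})/n}$, with the lower bound cut off at 0. A minimal sketch of that calculation is given below; the helper name `wald_ci` and the input values are illustrative assumptions and not part of the analysis code.

```r
# Wald (normal-approximation) confidence interval for a proportion,
# with the lower bound clipped at 0. `wald_ci` is a hypothetical helper.
wald_ci <- function(p, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)        # e.g. qnorm(.975) for a 95% interval
  half_width <- z * sqrt(p * (1 - p) / n)
  c(lower = max(0, p - half_width), upper = p + half_width)
}

# Illustrative values only: a small share observed among 100 answers
ci <- wald_ci(p = 0.02, n = 100)
print(ci)
```

For shares close to 0 the unclipped lower bound becomes negative, which is why the cutoff matters for rarely coded categories.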

Figure \@ref(fig:fig-sentiment-rf) shows the share of negative and neutral/positive associations for each of our seven measures. The emerging pattern mirrors the one from our main analysis (cf. Figure \@ref(fig:fig-sentiment)).

```{r fig-sentiment-rf, fig.cap="Distribution of associations and their sentiment across trust measures", fig.align="center"}

# levels(data2$code_sentiment_dichotomous_with_rf) <- gsub("neutral/positive", "neutral/positive (=0)", levels(data2$code_sentiment_dichotomous_with_rf))
# levels(data2$code_sentiment_dichotomous_with_rf) <- gsub("negative", "negative (=1)", levels(data2$code_sentiment_dichotomous_with_rf))

ggplot(data2   %>% 
    filter(!is.na(code_sentiment_dichotomous_with_rf)), aes(x= code_sentiment_dichotomous_with_rf,  
                              group=variable)) + 
    geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count", width=0.9) +
geom_errorbar(aes(
                  #ymin = ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..), 
  ymin = ifelse(..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..) < 0, 0, ..prop.. - qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)),
                  ymax = ..prop.. + qnorm(.975)*sqrt(..prop..*(1-..prop..)/..count..)), 
              stat = 'count', width=0.4) +
  
  
    geom_text(aes(label = scales::percent(..prop.., accuracy = 1),
                   y= ..prop.., 
              #hjust = ifelse(..prop..>0.5, 0.4, -0.1), 
              vjust = ifelse(..prop..>0.5, 2.5, -2)), 
              stat= "count",
                            family = "Times New Roman",
              size = 3) + #4 for presentations
    labs(y = "% of respondents", 
         fill="Associations \n(sentiment)")+
    facet_wrap(vars(variable), ncol = 7) +
    scale_y_continuous(labels = scales::percent, limits = c(0,1)) +
  scale_fill_manual(values = c("gray68", "red2"), labels = levels(data2$code_sentiment_dichotomous_with_rf)) +
  scale_color_manual(values = c("gray68", "red2"), labels = levels(data2$code_sentiment_dichotomous_with_rf)) + 
  #theme_minimal(base_size = 12) + #presentations
  theme_minimal(base_size = 11) +
  theme(axis.title.x=element_blank(),
        axis.text.x = element_text(angle = 30, hjust = 1), #for presentations 0.8
        legend.position="bottom",
        #legend.position="right", #presentations
        text = element_text(family = "Times New Roman"), #remove for presentations
        #legend.title = element_text(size=12), #presentations
        #legend.text = element_text(size=12), #presentations
        #strip.text = element_text(size=12), #presentations
        plot.caption = element_text(hjust = 0, size=9)) + #left-align+size
  xlab("")
```

Figure \@ref(fig:coefficient-plot-rf) shows the results of regressing trust scores on our association variables. The emerging pattern mirrors the one from our main analysis (cf. Figure \@ref(fig:coefficient-plot)).

```{r coefficient-plot-rf, fig.cap="Associations and trust scores across different measures", fig.align="center", fig.height=8, out.width="90%"}

data_plot1 <- data %>% 
  select(variable, value, code_known_unknown_with_rf) %>%
  mutate(code_known_unknown_with_rf = factor(code_known_unknown_with_rf, ordered = TRUE)) %>%
  nest(data = c(value, code_known_unknown_with_rf)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_known_unknown_with_rf, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "no controls")

data_plot1_controls <- data %>% 
  select(variable, value, code_known_unknown_with_rf, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_known_unknown_with_rf = factor(code_known_unknown_with_rf, ordered = TRUE)) %>%
  nest(data = c(value, code_known_unknown_with_rf, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_known_unknown_with_rf + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)",
         Variable != "age_cat18-27",
         Variable != "age_cat28-37",
         Variable != "age_cat38-47",
         Variable != "age_cat48-57",
         Variable != "age_cat58-80+",
         Variable != "sexMale",
         Variable != "ethnicityBlack",
         Variable != "ethnicityMixed",
         Variable != "ethnicityOther",
         Variable != "ethnicityWhite",
         Variable != "socioeconomic_status_num",
         Variable != "income_num",
         Variable != "education_num") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "with controls")

data_plot2 <- data %>% 
  select(variable, value, code_sentiment_dichotomous_with_rf) %>%
  mutate(code_sentiment_dichotomous_with_rf = factor(code_sentiment_dichotomous_with_rf, ordered = TRUE)) %>%
  nest(data = c(value, code_sentiment_dichotomous_with_rf)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_sentiment_dichotomous_with_rf, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "no controls")

data_plot2_controls <- data %>% 
  select(variable, value, code_sentiment_dichotomous_with_rf, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_sentiment_dichotomous_with_rf = factor(code_sentiment_dichotomous_with_rf, ordered = TRUE)) %>%
  nest(data = c(value, code_sentiment_dichotomous_with_rf, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_sentiment_dichotomous_with_rf + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)",
         Variable != "age_cat18-27",
         Variable != "age_cat28-37",
         Variable != "age_cat38-47",
         Variable != "age_cat48-57",
         Variable != "age_cat58-80+",
         Variable != "sexMale",
         Variable != "ethnicityBlack",
         Variable != "ethnicityMixed",
         Variable != "ethnicityOther",
         Variable != "ethnicityWhite",
         Variable != "socioeconomic_status_num",
         Variable != "income_num",
         Variable != "education_num") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "with controls")

data_plot2_both_controls <- data %>% 
  select(variable, value, code_known_unknown_with_rf, code_sentiment_dichotomous_with_rf, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_sentiment_dichotomous_with_rf = factor(code_sentiment_dichotomous_with_rf, ordered = TRUE)) %>%
  mutate(code_known_unknown_with_rf = factor(code_known_unknown_with_rf, ordered = TRUE)) %>%
  nest(data = c(value, code_known_unknown_with_rf, code_sentiment_dichotomous_with_rf, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num)) %>% 
  mutate(fit = map(data, ~ lm(value ~ code_known_unknown_with_rf + code_sentiment_dichotomous_with_rf + age_cat + sex + ethnicity + socioeconomic_status_num + income_num + education_num, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, level = 0.90, confint),
         results_95 = map(fit, level = 0.95, confint)) %>%
       mutate(results_90 = map(results_90, ~ data.frame(.)),
            results_95 = map(results_95, ~ data.frame(.))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
                  Coefficient = estimate,
                  SE = std.error) %>%
  filter(Variable != "(Intercept)",
         Variable != "age_cat18-27",
         Variable != "age_cat28-37",
         Variable != "age_cat38-47",
         Variable != "age_cat48-57",
         Variable != "age_cat58-80+",
         Variable != "sexMale",
         Variable != "ethnicityBlack",
         Variable != "ethnicityMixed",
         Variable != "ethnicityOther",
         Variable != "ethnicityWhite",
         Variable != "socioeconomic_status_num",
         Variable != "income_num",
         Variable != "education_num") %>%
  rename("conf.low_95" = "X2.5..",
         "conf.high_95" = "X97.5..",
         "conf.low_90" = "X5..",
         "conf.high_90" = "X95..") %>%
  mutate(Model = "both + controls")

data_regressions_with_rf <- bind_rows(data_plot1, 
                       data_plot1_controls,
                       data_plot2, 
                       data_plot2_controls,
                       data_plot2_both_controls)

data_regressions_with_rf <- data_regressions_with_rf %>% 
  mutate(variable = recode(variable,
                           "Most people" = "M1:\nMost \npeople",
                           "People first time" = "M2:\nPeople first time",
                           "Stranger" = "M3:\nStranger",
                           "Keeping a secret" = "M4.1:\nKeeping a secret",
                           "Repaying a loan" = "M4.2:\nRepaying a loan",
                           "Money advice" = "M4.3:\nMoney advice",
                           "Watching a loved one" = "M4.4:\nWatching a loved one"
                                                    
                                                    )) %>%
  mutate(Model = factor(Model, levels= c("both + controls",
                                      "with controls", 
                                      "no controls"))) %>% 
  mutate(Variable = recode(Variable, 
                                                    "code_known_unknown_with_rf.L" = "Associations\n(known others = 1)", 
                                                    "code_sentiment_dichotomous_with_rf.L" = "Associations\n(negative = 1)"
                                                    )) %>%
  mutate(Variable = factor(Variable, levels = c("Associations\n(negative = 1)",
                                                "Associations\n(known others = 1)"),
                           ordered = TRUE))
  

# Reorder measures alphabetically
data_regressions_with_rf <- data_regressions_with_rf %>%
    mutate(variable = factor(variable, 
                             ordered = TRUE,
                             levels = sort(levels(data_regressions_with_rf$variable)))) 


# Add model names
data_regressions_with_rf <- data_regressions_with_rf %>% 
  mutate(n_vars = map(data, ncol)) %>%
  arrange(variable, n_vars, desc(Variable), Model) %>%
  mutate(model_name = paste0("#", 1:42))
  #mutate(model_name = paste0("#", 1:18)) # for presentations

# Replace every 6th row with row five 
for(i in c(6,12,18,24,30,36,42)){
#for(i in c(6,12,18)){ # for presentations
data_regressions_with_rf$model_name[i] <- data_regressions_with_rf$model_name[i-1]
}
# Subtract 1 from the higher model numbers (simpler workaround?)
data_regressions_with_rf <- data_regressions_with_rf %>% 
  mutate(model_name_num = as.numeric(gsub("#", "", model_name)),
         model_name_num = ifelse(model_name_num>=7, model_name_num-1, model_name_num),
         model_name = paste0("#", model_name_num))
# data_regressions %>% select(variable, Variable, model_name, model_name_num)

ggplot(data_regressions_with_rf, aes(x = Variable, y = Coefficient, colour = Model, shape = Model)) +
  geom_hline(yintercept = 0, colour = gray(1/2), lty = 2) +
  geom_point(aes(x = Variable, 
                 y = Coefficient), 
             size = 2, 
             position = position_dodge(width = 0.7)) +
  geom_linerange(aes(x = Variable, 
                     ymin = conf.low_90,
                     ymax = conf.high_90),
                 lwd = 1, 
                 position = position_dodge(width = 0.7)) +
  geom_linerange(aes(x = Variable, 
                     ymin = conf.low_95,
                     ymax = conf.high_95),
                 lwd = 1/2, 
                 position = position_dodge(width = 0.7)) +
  geom_text(aes(x = Variable,
                y = conf.high_95 + 0.02,
                label = model_name), 
            position = position_dodge(width = 0.7),
            size = 2.5) +
  ggtitle("Outcome: Trust scores (std. 0-1) for different trust measures") +
  labs(y = "Linear Model Coefficient",
       caption = "Note: The figure shows point estimates for the coefficients of our dummy variables of interest, namely having associations with known others or negative\nassociations. Bars represent 90% (thicker) and 95% (thinner) confidence intervals. Data is the full dataset irrespective of the question or block randomization\n(details are provided in the Methods Section).")+
  scale_colour_manual(name = "Model specification",
                      labels = c("both dummies\n+ covariates", "one dummy\n+ covariates", "one dummy\nw/o covariates"),
                      values = c("#ff7f0e", "#9467bd", "#2ca02c")) +
  scale_shape_manual(name = "Model specification",
                     labels = c("both dummies\n+ covariates", "one dummy\n+ covariates", "one dummy\nw/o covariates"),
                     values = c(16, 17, 18)) +  # filled circle, triangle, diamond
  coord_flip() +
  facet_grid(variable ~ .,
             scales = "free",
             space = 'free',
             labeller = label_wrap_gen(width = 8, multi_line = TRUE),
             switch = "y") +
  scale_x_discrete(expand = c(0, 0.2)) +
  xlab("Measure") +
  theme_classic() +
  theme(legend.position = "bottom",
        strip.placement = "outside",
        text = element_text(family = "Times New Roman"),
        panel.spacing = unit(0.7, "lines")) +
  geom_vline(xintercept = 0.65, linetype = "solid",
             color = "black", size = 0.5) +
  geom_vline(xintercept = 2.35, linetype = "solid",
             color = "black", size = 0.5)

```
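
The coefficient plot rests on one linear model per trust measure, each reported with 90% and 95% confidence intervals. The following is a minimal base-R sketch of that per-measure logic on simulated toy data; the data frame `toy`, the column `known`, and all numbers are assumptions for illustration only. The sketch uses a plain 0/1 dummy. With a two-level ordered factor, R's default polynomial contrasts instead report a `.L` term that equals the dummy coefficient divided by $\sqrt{2}$.

```r
# Toy data (simulated, not the survey data): two measures, one binary predictor
set.seed(1)
toy <- data.frame(
  variable = rep(c("Most people", "Stranger"), each = 50),
  known    = rbinom(100, 1, 0.3)
)
toy$value <- 0.5 + 0.1 * toy$known + rnorm(100, sd = 0.2)

# Fit one linear model per measure
fits <- lapply(split(toy, toy$variable),
               function(d) lm(value ~ known, data = d))

# Collect the coefficient of interest with 90% and 95% confidence intervals
coefs <- do.call(rbind, lapply(names(fits), function(m) {
  fit  <- fits[[m]]
  ci90 <- confint(fit, "known", level = 0.90)
  ci95 <- confint(fit, "known", level = 0.95)
  data.frame(variable     = m,
             Coefficient  = unname(coef(fit)["known"]),
             conf.low_90  = ci90[1, 1], conf.high_90 = ci90[1, 2],
             conf.low_95  = ci95[1, 1], conf.high_95 = ci95[1, 2])
}))
print(coefs)
```

Each row of `coefs` corresponds to one point with a thick (90%) and a thin (95%) interval in a coefficient plot of this kind.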


\newpage



## A.7 Systematic bias of trust scores

As mentioned in the conclusion, we were interested in examining whether associations differ according to respondents' characteristics, e.g., their education or income. Figure \@ref(fig:correlation-matrix) displays Pearson's r correlation coefficients between a set of potentially relevant socio-demographic variables and our two binary association variables.


```{r correlation-matrix, fig.cap="Correlation matrix: Socio-demographic variables and association dummies", fig.align="center", fig.width=4, fig.height=4}
data_matrix <- data %>% select(code_sentiment_dichotomous, code_known_unknown, age_cat, sex, ethnicity, socioeconomic_status_num, income_num, education_num) %>%
  mutate(code_known_unknown_num = recode(code_known_unknown,
                                       `No` = 0, `Yes` = 1),
         code_sentiment_dichotomous_num = recode(code_sentiment_dichotomous,
                                       `neutral/positive` = 0, `negative` = 1)) %>%
  select(-code_sentiment_dichotomous, -code_known_unknown)
  

data_matrix <- model.matrix(~0+., data=data_matrix) %>%
  cor(., use = "everything")

x <- colnames(data_matrix)
names(x) <- colnames(data_matrix)

x <- plyr::rename(x, 
                                 replace = c("age_cat18-27" = "Age (18-27)",
                             "age_cat28-37" = "Age (28-37)", 
                             "age_cat38-47" = "Age (38-47)",
                             "age_cat48-57" = "Age (48-57)",
                             "age_cat58-80+" = "Age (58-80+)",
                             "sexMale" = "Sex (Male)",
                             "ethnicityBlack" = "Ethnicity (Black)",
                             "ethnicityMixed" = "Ethnicity (Mixed)",
                             "ethnicityOther" = "Ethnicity (Other)",
                             "ethnicityWhite" = "Ethnicity (White)",
                             "socioeconomic_status_num" = "Socioeconomic status (numeric)",
                             "income_num" = "Income (numeric)",
                             "education_num" = "Education (numeric)",
                             "code_sentiment_dichotomous_num" = "Association (Negative = 1)",
                             "code_known_unknown_num" = "Association (known others = 1)"))

colnames(data_matrix) <- names(x)
rownames(data_matrix) <- names(x)
# <-


  
    ggcorrplot::ggcorrplot(data_matrix, 
           #               sig.level=0.05, 
           lab_size = 1.5, 
           #               p.mat = NULL,
           # insig = c("pch", "blank"), 
           # pch = 1, 
           # pch.col = "black", 
           #pch.cex =0.5,
           tl.cex = 6,
           lab = TRUE) +
  theme(legend.title = element_text(size=6),
        legend.text = element_text(size=5),
        text = element_text(family = "Times New Roman"),
  # Change legend key size and key width
  legend.key.size = unit(0.5, "cm"),
  legend.key.width = unit(0.5,"cm") )
  

# Robustness check using Cramér's V, since the above displays Pearson's r, which is not necessarily appropriate for categorical data

# known/unknown -> sentiment (stranger danger)
# library(DescTools)
# tab <- table(data$sentiment_score, data$overall_code_di)
# Assocs(tab) # same finding
# chisq.test(data$overall_code_di, data$overall_code_di)
```

Overall, we find that none of the socio-demographic variables correlates meaningfully with our binary association measures. While this analysis is preliminary, the absence of a systematic relationship is reassuring: although the sentiment and content (known others) of associations are statistically significantly associated with the trust score, these associations do not appear to stem from specific socio-demographic covariates in the first place. In other words, differential associations are unlikely to introduce bias when we study the impact of different socio-demographics on trust scores.
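
Because Pearson's r is not necessarily appropriate for categorical variables, Cramér's V is a natural robustness check for such correlations. The following base-R sketch shows how it can be computed from a contingency table; the helper name `cramers_v` and the toy vectors are assumptions for illustration and are not taken from the survey data.

```r
# Cramér's V for two categorical vectors, using base R only.
# `cramers_v` is a hypothetical helper name.
cramers_v <- function(x, y) {
  tab  <- table(x, y)
  # correct = FALSE switches off the Yates continuity correction for 2x2 tables;
  # warnings about small expected counts are suppressed for the toy example
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  n    <- sum(tab)
  k    <- min(nrow(tab), ncol(tab)) - 1
  as.numeric(sqrt(chi2 / (n * k)))
}

# Toy binary indicators (illustrative values only)
negative <- c(1, 1, 0, 0, 1, 0, 1, 0)
known    <- c(1, 1, 0, 0, 1, 0, 0, 1)
v <- cramers_v(negative, known)
print(v)
```

For two binary variables, Cramér's V equals the absolute value of Pearson's r (the phi coefficient), so both statistics lead to the same conclusion in the two-by-two case.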

\newpage

## A.8 Cross-situational trust

Figure \@ref(fig:correlations-plot) depicts a correlation matrix for our trust measures. As we might have expected, all the different trust measures correlate positively. At the same time, these correlations do not seem high enough to argue that the different trust measures tap into one single concept. One possible approach would then be to take an average across the different situational trust measures to obtain an estimate of cross-situational trust [@Bauer2018-ex].

```{r correlations-plot, fig.cap="Correlation matrix: trust measures", fig.align="center", fig.width=5, fig.height=5}
x <- data2 %>%
  select(ID_participant, variable, value) %>%
  pivot_wider(names_from = variable, values_from = value) %>%
  select(-ID_participant) %>%
  select(sort(tidyselect::peek_vars())) %>%
  cor(use = "pairwise.complete.obs") %>%
  round(., 1)

ggcorrplot::ggcorrplot(x, 
  lab_size = 2, 
  tl.cex = 6,
  lab = TRUE) +
  theme(
    legend.title = element_text(size = 6),
    legend.text = element_text(size = 5),
    text = element_text(family = "Times New Roman"),
    legend.key.size = unit(0.5, "cm"),
    legend.key.width = unit(0.5, "cm")
  )


#ggsave("correlations.pdf", width = 7, height = 6)
```
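
Averaging across the situational measures to obtain a cross-situational trust score can be sketched as follows. The data frame `trust_wide`, its column names, and its values are hypothetical, and averaging over the available (non-missing) measures per respondent is an assumption rather than a documented analysis step.

```r
# Toy wide-format data: one row per respondent, one column per situational measure
# (hypothetical values on a 0-1 scale)
trust_wide <- data.frame(
  secret       = c(0.50, 0.25, NA),
  loan         = c(0.75, 0.25, 0.50),
  money_advice = c(0.25, 0.25, 1.00),
  loved_one    = c(0.50, 0.25, 0.00)
)

situational <- c("secret", "loan", "money_advice", "loved_one")

# Mean over the situational measures a respondent actually answered
trust_wide$cross_situational <- rowMeans(trust_wide[, situational], na.rm = TRUE)
print(trust_wide)
```

Setting `na.rm = TRUE` keeps respondents who did not answer every measure, at the cost of averaging over different sets of situations.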


\newpage

## A.9 Regression models

Tables \@ref(tab:tab-reg-1), \@ref(tab:tab-reg-2), \@ref(tab:tab-reg-3), \@ref(tab:tab-reg-4), \@ref(tab:tab-reg-5), \@ref(tab:tab-reg-6) and \@ref(tab:tab-reg-7) show regression model estimates underlying Figure \@ref(fig:coefficient-plot).

```{r regression-tables, message=FALSE, warning=FALSE}
# Omit 6th model (= repetition)
data_regressions <- data_regressions %>% 
  group_by(variable) %>%
  mutate(index = row_number()) %>% 
  filter(index!=6) %>%
  ungroup()



# Notes and rename variables
regression_notes <- "Notes: Stars indicate significance levels +=.1, *=.05, **=.01, ***=.001. The dependent variable trust contains respondents' trust scores across all available measures. Hence, the dataset contains repeated observations of the same respondents on different trust measures."
regression_varnames <- c("age_cat28-37" = "Age (28-37)", 
                             "age_cat38-47" = "Age (38-47)",
                             "age_cat48-57" = "Age (48-57)",
                             "age_cat58-80+" = "Age (58-80+)",
                             "sexMale" = "Sex (Male)",
                             "ethnicityBlack" = "Ethnicity (Black)",
                             "ethnicityMixed" = "Ethnicity (Mixed)",
                             "ethnicityOther" = "Ethnicity (Other)",
                             "ethnicityWhite" = "Ethnicity (White)",
                             "socioeconomic_status_num" = "Socioeconomic status (numeric)",
                             "income_num" = "Income (numeric)",
                             "education_num" = "Education (numeric)",
                             "code_known_unknownYes" = "Associations (known others = 1)",
                             "code_sentiment_dichotomousneutral/positive" = "Associations (negative = 1)",
                         "code_sentiment_dichotomous.L" = "Associations (negative = 1)",
                         "code_known_unknown.L" = "Associations (known others = 1)"
                             )


dep_vars <- sort(levels(data_regressions$variable))


for(i in seq_along(dep_vars)){
   
  

dep_var_i <- dep_vars[i]
#cat(dep_var_i)
data_i <- data_regressions %>% filter(variable==dep_var_i)
models_i <- as.list(data_i$fit)
model_name_i <- data_regressions %>% 
  filter(variable==dep_var_i) %>%
  pull(model_name)
names(models_i) <- model_name_i



#dep_var_i <- gsub("M", "Dependent variable: Measure ", dep_var_i)

table_i <- modelsummary(models_i,
             title = 'Linear regression of trust scores (Y) on associations (Xs)',
             output = 'gt',
             notes = regression_notes,
             stars = TRUE,
             coef_rename = regression_varnames,
             gof_omit = "IC|F|Log|Adj") %>%
  #fontsize(size=10) %>%
  #line_spacing(space = 0.3, part = "all") %>%
    tab_spanner(label = dep_var_i, 
                columns = 2:6) %>%
  tab_options(
    table.font.size = 10,
    data_row.padding = px(1),
    table.border.top.color = "white",
    heading.border.bottom.color = "black",
    row_group.border.top.color = "black",
    row_group.border.bottom.color = "white",
    table.border.bottom.color = "white",
    column_labels.border.top.color = "black",
    column_labels.border.bottom.color = "black",
    table_body.border.bottom.color = "black",
    table_body.hlines.color = "white",
    table.font.names = "Times New Roman"
  )
assign(paste0("tab_reg_", i), table_i)
}

```



```{r tab-reg-1, message=FALSE, warning=FALSE, results="asis"}
tab_reg_1
```

```{r tab-reg-2, message=FALSE, warning=FALSE, results="asis"}
tab_reg_2
```

```{r tab-reg-3, message=FALSE, warning=FALSE, results="asis"}
tab_reg_3
```

```{r tab-reg-4, message=FALSE, warning=FALSE, results="asis"}
tab_reg_4
```

```{r tab-reg-5, message=FALSE, warning=FALSE, results="asis"}
tab_reg_5
```

```{r tab-reg-6, message=FALSE, warning=FALSE, results="asis"}
tab_reg_6
```

```{r tab-reg-7, message=FALSE, warning=FALSE, results="asis"}
tab_reg_7
```

